1DRBD.CONF(5) Configuration Files DRBD.CONF(5)
2
3
4
5 NAME
6 drbd.conf - DRBD Configuration Files
7
8 INTRODUCTION TO DRBD
9 DRBD implements block devices which replicate their data to all nodes
10 of a cluster. The actual data and associated metadata are usually
11 stored redundantly on "ordinary" block devices on each cluster node.
12
13 Replicated block devices are called /dev/drbdminor by default. They are
14 grouped into resources, with one or more devices per resource.
15 Replication among the devices in a resource takes place in
16 chronological order. With DRBD, we refer to the devices inside a
17 resource as volumes.
18
19 In DRBD 9, a resource can be replicated between two or more cluster
20 nodes. The connections between cluster nodes are point-to-point links,
21 and use TCP or a TCP-like protocol. All nodes must be directly
22 connected.
23
24 DRBD consists of low-level user-space components which interact with
25 the kernel and perform basic operations (drbdsetup, drbdmeta), a
26 high-level user-space component which understands and processes the
27 DRBD configuration and translates it into basic operations of the
28 low-level components (drbdadm), and a kernel component.
29
30 The default DRBD configuration consists of /etc/drbd.conf and of
31 additional files included from there, usually global_common.conf and
32 all *.res files inside /etc/drbd.d/. It has turned out to be useful to
33 define each resource in a separate *.res file.
34
35 The configuration files are designed so that each cluster node can
36 contain an identical copy of the entire cluster configuration. The host
37 name of each node determines which parts of the configuration apply
38 (uname -n). It is highly recommended to keep the cluster configuration
39 on all nodes in sync by manually copying it to all nodes, or by
40 automating the process with csync2 or a similar tool.
41
42 EXAMPLE CONFIGURATION FILE
43 global {
44 usage-count yes;
45 udev-always-use-vnr;
46 }
47 resource r0 {
48 net {
49 cram-hmac-alg sha1;
50 shared-secret "FooFunFactory";
51 }
52 volume 0 {
53 device "/dev/drbd1";
54 disk "/dev/sda7";
55 meta-disk internal;
56 }
57 on "alice" {
58 node-id 0;
59 address 10.1.1.31:7000;
60 }
61 on "bob" {
62 node-id 1;
63 address 10.1.1.32:7000;
64 }
65 connection {
66 host "alice" port 7000;
67 host "bob" port 7000;
68 net {
69 protocol C;
70 }
71 }
72 }
73
74 This example defines a resource r0 which contains a single replicated
75 device with volume number 0. The resource is replicated among hosts
76 alice and bob, which have the IPv4 addresses 10.1.1.31 and 10.1.1.32
77 and the node identifiers 0 and 1, respectively. On both hosts, the
78 replicated device is called /dev/drbd1, and the actual data and
79 metadata are stored on the lower-level device /dev/sda7. The connection
80 between the hosts uses protocol C.
81
82 Enclose strings within double-quotation marks (") to differentiate them
83 from resource keywords. Please refer to the DRBD User's Guide[1] for
84 more examples.
85
86 FILE FORMAT
87 DRBD configuration files consist of sections, which contain other
88 sections and parameters depending on the section types. Each section
89 consists of one or more keywords, sometimes a section name, an opening
90 brace (“{”), the section's contents, and a closing brace (“}”).
91 Parameters inside a section consist of a keyword, followed by one or
92 more keywords or values, and a semicolon (“;”).
93
94 Some parameter values have a default scale which applies when a plain
95 number is specified (for example Kilo, or 1024 times the numeric
96 value). Such default scales can be overridden by using a suffix (for
97 example, M for Mega). The common suffixes K = 2^10 = 1024, M = 1024 K,
98 and G = 1024 M are supported.
99
100 Comments start with a hash sign (“#”) and extend to the end of the
101 line. In addition, any section can be prefixed with the keyword skip,
102 which causes the section and any sub-sections to be ignored.
103
104 Additional files can be included with the include file-pattern
105 statement (see glob(7) for the expressions supported in file-pattern).
106 Include statements are only allowed outside of sections.
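
    For example, a stock /etc/drbd.conf often consists of nothing but
    include statements along these lines (the paths follow the default
    layout described above):

        include "drbd.d/global_common.conf";
        include "drbd.d/*.res";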
107
108 The following sections are defined (indentation indicates in which
109 context):
110
111 common
112    [disk]
113    [handlers]
114    [net]
115    [options]
116    [startup]
117 global
118    [require-drbd-module-version-{eq,ne,gt,ge,lt,le}]
119 resource
120    connection
121       multiple path | 2 host
122       [net]
123       [volume]
124          [peer-device-options]
125       [peer-device-options]
126    connection-mesh
127       [net]
128    [disk]
129    floating
130    handlers
131    [net]
132    on
133       volume
134          disk
135          [disk]
136    options
137    stacked-on-top-of
138    startup
139
140 Sections in brackets affect other parts of the configuration: inside
141 the common section, they apply to all resources. A disk section inside
142 a resource or on section applies to all volumes of that resource, and a
143 net section inside a resource section applies to all connections of
144 that resource. This makes it possible to avoid repeating identical
145 options for each resource, connection, or volume. Options can be
146 overridden in a more specific resource, connection, on, or volume section.
147
148 peer-device-options are resync-rate, c-plan-ahead, c-delay-target,
149 c-fill-target, c-max-rate and c-min-rate. For backward
150 compatibility, they can also be specified in any disk options
151 section. They are inherited into all relevant connections. If they
152 are given at the connection level, they are inherited by all
153 volumes on that connection. A peer-device-options section is
154 started with the disk keyword.
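
    As an illustration, a fixed resync rate for one particular connection
    could be set with a peer-device-options section like the following
    sketch (host names and the rate are placeholders):

        connection {
            host "alice";
            host "bob";
            disk {
                c-plan-ahead 0;
                resync-rate 33M;
            }
        }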
155
156 Sections
157 common
158
159 This section can contain one each of a disk, handlers, net, options,
160 and startup section. All resources inherit the parameters in these
161 sections as their default values.
162
163 connection [name]
164
165 Define a connection between two hosts. This section must contain
166 two host parameters or multiple path sections. The optional name is
167 used to refer to the connection in the system log and in other
168 messages. If no name is specified, the peer's host name is used
169 instead.
170
171 path
172
173 Define a path between two hosts. This section must contain two host
174 parameters.
175
176 connection-mesh
177
178 Define a connection mesh between multiple hosts. This section must
179 contain a hosts parameter, which has the host names as arguments.
180 This section is a shortcut to define many connections which share
181 the same network options.
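
    A sketch of a three-node mesh sharing one net configuration (the host
    names are placeholders) might look like this:

        connection-mesh {
            hosts "alice" "bob" "charlie";
            net {
                protocol C;
            }
        }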
182
183 disk
184
185 Define parameters for a volume. All parameters in this section are
186 optional.
187
188 floating [address-family] addr:port
189
190 Like the on section, except that instead of the host name a network
191 address is used to determine if it matches a floating section.
192
193 The node-id parameter in this section is required. If the address
194 parameter is not provided, no connections to peers will be created
195 by default. The device, disk, and meta-disk parameters must be
196 defined in, or inherited by, this section.
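
    A hedged sketch of a floating section (address, port, and node-id are
    placeholders; device, disk, and meta-disk are assumed to be inherited
    from a resource-level volume section):

        floating 10.1.1.31:7000 {
            node-id 0;
        }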
197
198 global
199
200 Define some global parameters. All parameters in this section are
201 optional. Only one global section is allowed in the configuration.
202
203 require-drbd-module-version-{eq,ne,gt,ge,lt,le}
204
205 This statement contains one of the valid forms and a three-digit
206 version number (e.g., require-drbd-module-version-eq 9.0.16;). If
207 the currently loaded DRBD kernel module does not match the
208 specification, parsing is aborted. The comparison operator names
209 have the same semantics as in test(1).
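
    For example, to abort parsing unless at least a known-good module
    version is loaded (the version number is illustrative):

        global {
            require-drbd-module-version-ge 9.0.16;
        }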
210
211 handlers
212
213 Define handlers to be invoked when certain events occur. The kernel
214 passes the resource name in the first command-line argument and
215 sets the following environment variables depending on the event's
216 context:
217
218 • For events related to a particular device: the device's minor
219 number in DRBD_MINOR, the device's volume number in
220 DRBD_VOLUME.
221
222 • For events related to a particular device on a particular peer:
223 the connection endpoints in DRBD_MY_ADDRESS, DRBD_MY_AF,
224 DRBD_PEER_ADDRESS, and DRBD_PEER_AF; the device's local minor
225 number in DRBD_MINOR, and the device's volume number in
226 DRBD_VOLUME.
227
228 • For events related to a particular connection: the connection
229 endpoints in DRBD_MY_ADDRESS, DRBD_MY_AF, DRBD_PEER_ADDRESS,
230 and DRBD_PEER_AF; and, for each device defined for that
231 connection: the device's minor number in
232 DRBD_MINOR_volume-number.
233
234 • For events that identify a device, if a lower-level device is
235 attached, the lower-level device's device name is passed in
236 DRBD_BACKING_DEV (or DRBD_BACKING_DEV_volume-number).
237
238 All parameters in this section are optional. Only a single handler
239 can be defined for each event; if no handler is defined, nothing
240 will happen.
241
242 net
243
244 Define parameters for a connection. All parameters in this section
245 are optional.
246
247 on host-name [...]
248
249 Define the properties of a resource on a particular host or set of
250 hosts. Specifying more than one host name can make sense in a setup
251 with IP address failover, for example. The host-name argument must
252 match the Linux host name (uname -n).
253
254 Usually contains or inherits at least one volume section. The
255 node-id and address parameters must be defined in this section. The
256 device, disk, and meta-disk parameters must be defined in, or
257 inherited by, this section.
258
259 A normal configuration file contains two or more on sections for
260 each resource. Also see the floating section.
261
262 options
263
264 Define parameters for a resource. All parameters in this section
265 are optional.
266
267 resource name
268
269 Define a resource. Usually contains at least two on sections and at
270 least one connection section.
271
272 stacked-on-top-of resource
273
274 Used instead of an on section for configuring a stacked resource
275 with three to four nodes.
276
277 Starting with DRBD 9, stacking is deprecated. It is advised to use
278 resources which are replicated among more than two nodes instead.
279
280 startup
281
282 The parameters in this section determine the behavior of a resource
283 at startup time.
284
285 volume volume-number
286
287 Define a volume within a resource. The volume numbers in the
288 various volume sections of a resource define which devices on which
289 hosts form a replicated device.
290
291 Section connection Parameters
292 host name [address [address-family] address] [port port-number]
293
294 Defines an endpoint for a connection. Each host statement refers to
295 an on section in a resource. If a port number is defined, this
296 endpoint will use the specified port instead of the port defined in
297 the on section. Each connection section must contain exactly two
298 host parameters. Instead of two host parameters the connection may
299 contain multiple path sections.
300
301 Section path Parameters
302 host name [address [address-family] address] [port port-number]
303
304 Defines an endpoint for a connection. Each host statement refers to
305 an on section in a resource. If a port number is defined, this
306 endpoint will use the specified port instead of the port defined in
307 the on section. Each path section must contain exactly two host
308 parameters.
309
310 Section connection-mesh Parameters
311 hosts name...
312
313 Defines all nodes of a mesh. Each name refers to an on section in a
314 resource. The port that is defined in the on section will be used.
315
316 Section disk Parameters
317 al-extents extents
318
319 DRBD automatically maintains a "hot" or "active" disk area likely
320 to be written to again soon based on the recent write activity. The
321 "active" disk area can be written to immediately, while "inactive"
322 disk areas must be "activated" first, which requires a meta-data
323 write. We also refer to this active disk area as the "activity
324 log".
325
326 The activity log saves meta-data writes, but the whole log must be
327 resynced upon recovery of a failed node. The size of the activity
328 log is a major factor of how long a resync will take and how fast a
329 replicated disk will become consistent after a crash.
330
331 The activity log consists of a number of 4-Megabyte segments; the
332 al-extents parameter determines how many of those segments can be
333 active at the same time. The default value for al-extents is 1237,
334 with a minimum of 7 and a maximum of 65536.
335
336 Note that the effective maximum may be smaller, depending on how
337 you created the device metadata; see also drbdmeta(8). The
338 effective maximum is 919 * (available on-disk activity-log
339 ring-buffer area / 4 kB - 1); the default 32 kB ring buffer yields
340 a maximum of 6433 (covering more than 25 GiB of data). We recommend
341 keeping this well within the amount your backend storage and
342 replication link are able to resync within about 5 minutes.
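
    As an illustration, a larger activity log can be configured in a disk
    section, per resource or in the common section (the value is only an
    example):

        disk {
            al-extents 3389;
        }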
343
344 al-updates {yes | no}
345
346 With this parameter, the activity log can be turned off entirely
347 (see the al-extents parameter). This will speed up writes because
348 fewer meta-data writes will be necessary, but the entire device
349 needs to be resynchronized upon recovery of a failed primary node.
350 The default value for al-updates is yes.
351
352 disk-barrier,
353 disk-flushes,
354 disk-drain
355 DRBD has three methods of handling the ordering of dependent write
356 requests:
357
358 disk-barrier
359 Use disk barriers to make sure that requests are written to
360 disk in the right order. Barriers ensure that all requests
361 submitted before a barrier make it to the disk before any
362 requests submitted after the barrier. This is implemented using
363 'tagged command queuing' on SCSI devices and 'native command
364 queuing' on SATA devices. Only some devices and device stacks
365 support this method. The device mapper (LVM) only supports
366 barriers in some configurations.
367
368 Note that on systems which do not support disk barriers,
369 enabling this option can lead to data loss or corruption. Until
370 DRBD 8.4.1, disk-barrier was turned on if the I/O stack below
371 DRBD did support barriers. Kernels since linux-2.6.36 (or
372 2.6.32 RHEL6) no longer allow DRBD to detect whether barriers
373 are supported. Since drbd-8.4.2, this option is off by default and
374 needs to be enabled explicitly.
375
376 disk-flushes
377 Use disk flushes between dependent write requests, also
378 referred to as 'force unit access' by drive vendors. This
379 forces all data to disk. This option is enabled by default.
380
381 disk-drain
382 Wait for the request queue to "drain" (that is, wait for the
383 requests to finish) before submitting a dependent write
384 request. This method requires that requests are stable on disk
385 when they finish. Before DRBD 8.0.9, this was the only method
386 implemented. This option is enabled by default. Do not disable
387 in production environments.
388
389 Of these three methods, DRBD will use the first that is enabled
390 and supported by the backing storage device. If all three of these
391 options are turned off, DRBD will submit write requests without
392 bothering about dependencies. Depending on the I/O stack, write
393 requests can be reordered, and they can be submitted in a different
394 order on different cluster nodes. This can result in data loss or
395 corruption. Therefore, turning off all three methods of controlling
396 write ordering is strongly discouraged.
397
398 A general guideline for configuring write ordering is to use disk
399 barriers or disk flushes when using ordinary disks (or an ordinary
400 disk array) with a volatile write cache. On storage without cache
401 or with a battery backed write cache, disk draining can be a
402 reasonable choice.
403
404 disk-timeout
405 If the lower-level device on which a DRBD device stores its data
406 does not finish an I/O request within the defined disk-timeout,
407 DRBD treats this as a failure. The lower-level device is detached,
408 and the device's disk state advances to Diskless. If DRBD is
409 connected to one or more peers, the failed request is passed on to
410 one of them.
411
412 This option is dangerous and may lead to kernel panic!
413
414 "Aborting" requests, or force-detaching the disk, is intended for
415 completely blocked/hung local backing devices which no longer
416 complete requests at all, not even with error completions. In this
417 situation, usually a hard-reset and failover is the only way out.
418
419 By "aborting", basically faking a local error-completion, we allow
420 for a more graceful switchover by cleanly migrating services. Still,
421 the affected node has to be rebooted "soon".
422
423 By completing these requests, we allow the upper layers to re-use
424 the associated data pages.
425
426 If later the local backing device "recovers", and now DMAs some
427 data from disk into the original request pages, in the best case it
428 will just put random data into unused pages; but typically it will
429 corrupt meanwhile completely unrelated data, causing all sorts of
430 damage.
431
432 This means that a delayed successful completion, especially of READ
433 requests, is a reason to panic(). We assume that a delayed *error*
434 completion is OK, though we will still complain noisily about it.
435
436 The default value of disk-timeout is 0, which stands for an
437 infinite timeout. Timeouts are specified in units of 0.1 seconds.
438 This option is available since DRBD 8.3.12.
439
440 md-flushes
441 Enable disk flushes and disk barriers on the meta-data device. This
442 option is enabled by default. See the disk-flushes parameter.
443
444 on-io-error handler
445
446 Configure how DRBD reacts to I/O errors on a lower-level device.
447 The following policies are defined:
448
449 pass_on
450 Change the disk status to Inconsistent, mark the failed block
451 as inconsistent in the bitmap, and retry the I/O operation on a
452 remote cluster node.
453
454 call-local-io-error
455 Call the local-io-error handler (see the handlers section).
456
457 detach
458 Detach the lower-level device and continue in diskless mode.
459
460
461 read-balancing policy
462 Distribute read requests among cluster nodes as defined by policy.
463 The supported policies are prefer-local (the default),
464 prefer-remote, round-robin, least-pending, when-congested-remote,
465 32K-striping, 64K-striping, 128K-striping, 256K-striping,
466 512K-striping and 1M-striping.
467
468 This option is available since DRBD 8.4.1.
469
470 resync-after res-name/volume
471
472 Define that a device should only resynchronize after the specified
473 other device. By default, no order between devices is defined, and
474 all devices will resynchronize in parallel. Depending on the
475 configuration of the lower-level devices, and the available network
476 and disk bandwidth, this can slow down the overall resync process.
477 This option can be used to form a chain or tree of dependencies
478 among devices.
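
    A sketch (resource names and the volume number are placeholders) that
    lets r1 resynchronize only after volume 0 of r0 has finished (other
    volume and resource parameters omitted):

        resource r1 {
            volume 0 {
                disk {
                    resync-after r0/0;
                }
            }
        }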
479
480 rs-discard-granularity byte
481 When rs-discard-granularity is set to a non-zero, positive value,
482 DRBD tries to do resync operations in requests of this size.
483 In case such a block contains only zero bytes on the sync source
484 node, the sync target node will issue a discard/trim/unmap command
485 for the area.
486
487 The value is constrained by the discard granularity of the backing
488 block device. If rs-discard-granularity is not a multiple of
489 the discard granularity of the backing block device, DRBD rounds it
490 up. The feature only becomes active if the backing block device reads
491 back zeroes after a discard command.
492
493 The usage of rs-discard-granularity may cause c-max-rate to be
494 exceeded. In particular, the resync rate may reach 10x the value of
495 rs-discard-granularity per second.
496
497 The default value of rs-discard-granularity is 0. This option is
498 available since 8.4.7.
499
500 discard-zeroes-if-aligned {yes | no}
501
502 There are several aspects to discard/trim/unmap support on linux
503 block devices. Even if discard is supported in general, it may fail
504 silently, or may partially ignore discard requests. Devices also
505 announce whether reading from unmapped blocks returns defined data
506 (usually zeroes), or undefined data (possibly old data, possibly
507 garbage).
508
509 If on different nodes, DRBD is backed by devices with differing
510 discard characteristics, discards may lead to data divergence (old
511 data or garbage left over on one backend, zeroes due to unmapped
512 areas on the other backend). Online verify would now potentially
513 report tons of spurious differences. While probably harmless for
514 most use cases (fstrim on a file system), DRBD cannot have that.
515
516 To play safe, we have to disable discard support, if our local
517 backend (on a Primary) does not support "discard_zeroes_data=true".
518 We also have to translate discards to explicit zero-out on the
519 receiving side, unless the receiving side (Secondary) supports
520 "discard_zeroes_data=true", thereby allocating areas that were
521 supposed to be unmapped.
522
523 There are some devices (notably the LVM/DM thin provisioning) that
524 are capable of discard, but announce discard_zeroes_data=false. In
525 the case of DM-thin, discards aligned to the chunk size will be
526 unmapped, and reading from unmapped sectors will return zeroes.
527 However, unaligned partial head or tail areas of discard requests
528 will be silently ignored.
529
530 If we now add a helper to explicitly zero-out these unaligned
531 partial areas, while passing on the discard of the aligned full
532 chunks, we effectively achieve discard_zeroes_data=true on such
533 devices.
534
535 Setting discard-zeroes-if-aligned to yes will allow DRBD to use
536 discards, and to announce discard_zeroes_data=true, even on
537 backends that announce discard_zeroes_data=false.
538
539 Setting discard-zeroes-if-aligned to no will cause DRBD to always
540 fall-back to zero-out on the receiving side, and to not even
541 announce discard capabilities on the Primary, if the respective
542 backend announces discard_zeroes_data=false.
543
544 We used to ignore the discard_zeroes_data setting completely. To
545 not break established and expected behaviour, and suddenly cause
546 fstrim on thin-provisioned LVs to run out-of-space instead of
547 freeing up space, the default value is yes.
548
549 This option is available since 8.4.7.
550
551 disable-write-same {yes | no}
552
553 Some disks announce WRITE_SAME support to the kernel but fail with
554 an I/O error upon actually receiving such a request. This mostly
555 happens when using virtualized disks -- notably, this behavior has
556 been observed with VMware's virtual disks.
557
558 When disable-write-same is set to yes, WRITE_SAME detection is
559 manually overridden and support is disabled.
560
561 The default value of disable-write-same is no. This option is
562 available since 8.4.7.
563
564 Section peer-device-options Parameters
565 Please note that you open the section with the disk keyword.
566
567 c-delay-target delay_target,
568 c-fill-target fill_target,
569 c-max-rate max_rate,
570 c-plan-ahead plan_time
571 Dynamically control the resync speed. The following modes are
572 available:
573
574 • Dynamic control with fill target (default). Enabled when
575 c-plan-ahead is non-zero and c-fill-target is non-zero. The
576 goal is to fill the buffers along the data path with a defined
577 amount of data. This mode is recommended when DRBD-proxy is
578 used. Configured with c-plan-ahead, c-fill-target and
579 c-max-rate.
580
581 • Dynamic control with delay target. Enabled when c-plan-ahead is
582 non-zero (default) and c-fill-target is zero. The goal is to
583 have a defined delay along the path. Configured with
584 c-plan-ahead, c-delay-target and c-max-rate.
585
586 • Fixed resync rate. Enabled when c-plan-ahead is zero. DRBD will
587 try to perform resync I/O at a fixed rate. Configured with
588 resync-rate.
589
590 The c-plan-ahead parameter defines how fast DRBD adapts to changes
591 in the resync speed. It should be set to five times the network
592 round-trip time or more. The default value of c-plan-ahead is 20,
593 in units of 0.1 seconds.
594
595 The c-fill-target parameter defines how much resync data DRBD
596 should aim to have in-flight at all times. Common values for
597 "normal" data paths range from 4K to 100K. The default value of
598 c-fill-target is 100, in units of sectors.
599
600 The c-delay-target parameter defines the delay in the resync path
601 that DRBD should aim for. This should be set to five times the
602 network round-trip time or more. The default value of
603 c-delay-target is 10, in units of 0.1 seconds.
604
605 The c-max-rate parameter limits the maximum bandwidth used by
606 dynamically controlled resyncs. Setting this to zero removes the
607 limitation (since DRBD 9.0.28). It should be set to either the
608 bandwidth available between the DRBD hosts and the machines hosting
609 DRBD-proxy, or to the available disk bandwidth. The default value
610 of c-max-rate is 102400, in units of KiB/s.
611
612 Dynamic resync speed control is available since DRBD 8.3.9.
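
    A sketch of fill-target based control, for instance for a DRBD-proxy
    setup; all values are placeholders that need tuning to the actual
    environment:

        disk {
            c-plan-ahead 20;
            c-fill-target 100K;
            c-max-rate 100M;
        }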
613
614 c-min-rate min_rate
615 A node which is primary and sync-source has to schedule application
616 I/O requests and resync I/O requests. The c-min-rate parameter
617 limits how much bandwidth is available for resync I/O; the
618 remaining bandwidth is used for application I/O.
619
620 A c-min-rate value of 0 means that there is no limit on the resync
621 I/O bandwidth. This can slow down application I/O significantly.
622 Use a value of 1 (1 KiB/s) for the lowest possible resync rate.
623
624 The default value of c-min-rate is 250, in units of KiB/s.
625
626 resync-rate rate
627
628 Define how much bandwidth DRBD may use for resynchronizing. DRBD
629 allows "normal" application I/O even during a resync. If the resync
630 takes up too much bandwidth, application I/O can become very slow.
631 This parameter allows that to be avoided. Please note that this
632 option only works when the dynamic resync controller is disabled.
633
634 Section global Parameters
635 dialog-refresh time
636
637 The DRBD init script can be used to configure and start DRBD
638 devices, which can involve waiting for other cluster nodes. While
639 waiting, the init script shows the remaining waiting time. The
640 dialog-refresh defines the number of seconds between updates of
641 that countdown. The default value is 1; a value of 0 turns off the
642 countdown.
643
644 disable-ip-verification
645 Normally, DRBD verifies that the IP addresses in the configuration
646 match the host names. Use the disable-ip-verification parameter to
647 disable these checks.
648
649 usage-count {yes | no | ask}
650 As explained on DRBD's Online Usage Counter[2] web page, DRBD
651 includes a mechanism for anonymously counting how many
652 installations are using which versions of DRBD. The results are
653 available on the web page for anyone to see.
654
655 This parameter defines if a cluster node participates in the usage
656 counter; the supported values are yes, no, and ask (ask the user,
657 the default).
658
659 We would like to ask users to participate in the online usage
660 counter as this provides us valuable feedback for steering the
661 development of DRBD.
662
663 udev-always-use-vnr
664 When udev asks drbdadm for a list of device related symlinks,
665 drbdadm would suggest symlinks with differing naming conventions,
666 depending on whether the resource has explicit volume VNR { }
667 definitions, or only one single volume with the implicit volume
668 number 0:
669
670 # implicit single volume without "volume 0 {}" block
671 DEVICE=drbd<minor>
672 SYMLINK_BY_RES=drbd/by-res/<resource-name>
673 SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name>
674
675 # explicit volume definition: volume VNR { }
676 DEVICE=drbd<minor>
677 SYMLINK_BY_RES=drbd/by-res/<resource-name>/VNR
678 SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name>
679
680 If you define this parameter in the global section, drbdadm will
681 always add the .../VNR part, regardless of whether the volume
682 definition was implicit or explicit.
683
684 For legacy backward compatibility, this is off by default, but we
685 recommend enabling it.
686
687 Section handlers Parameters
688 after-resync-target cmd
689
690 Called on a resync target when a node's disk state changes from
691 Inconsistent to Consistent after a resync finishes. This handler can
692 be used for removing the snapshot created in the
693 before-resync-target handler.
694
695 before-resync-target cmd
696
697 Called on a resync target before a resync begins. This handler can
698 be used for creating a snapshot of the lower-level device for the
699 duration of the resync: if the resync source becomes unavailable
700 during a resync, reverting to the snapshot can restore a consistent
701 state.
702
703 before-resync-source cmd
704
705 Called on a resync source before a resync begins.
706
707 out-of-sync cmd
708
709 Called on all nodes after a verify finishes and out-of-sync blocks
710 were found. This handler is mainly used for monitoring purposes. An
711 example would be to call a script that sends an alert SMS.
712
713 quorum-lost cmd
714
715 Called on a Primary that lost quorum. This handler is usually used
716 to reboot the node if it is not possible to restart the application
717 that uses the storage on top of DRBD.
718
719 fence-peer cmd
720
721 Called when a node should fence a resource on a particular peer.
722 The handler should not use the same communication path that DRBD
723 uses for talking to the peer.
724
725 unfence-peer cmd
726
727 Called when a node should remove fencing constraints from other
728 nodes.
729
730 initial-split-brain cmd
731
732 Called when DRBD connects to a peer and detects that the peer is in
733 a split-brain state with the local node. This handler is also
734 called for split-brain scenarios which will be resolved
735 automatically.
736
737 local-io-error cmd
738
739 Called when an I/O error occurs on a lower-level device.
740
741 pri-lost cmd
742
743 The local node is currently primary, but DRBD believes that it
744 should become a sync target. The node should give up its primary
745 role.
746
747 pri-lost-after-sb cmd
748
749 The local node is currently primary, but it has lost the
750 after-split-brain auto recovery procedure. The node should be
751 abandoned.
752
753 pri-on-incon-degr cmd
754
755 The local node is primary, and neither the local lower-level device
756 nor a lower-level device on a peer is up to date. (The primary has
757 no device to read from or to write to.)
758
759 split-brain cmd
760
761 DRBD has detected a split-brain situation which could not be
762 resolved automatically. Manual recovery is necessary. This handler
763 can be used to call for administrator attention.
764
765 disconnected cmd
766
767 A connection to a peer went down. The handler can learn about the
768 reason for the disconnect from the DRBD_CSTATE environment
769 variable.
770
771 Section net Parameters
772 after-sb-0pri policy
773 Define how to react if a split-brain scenario is detected and none
774 of the two nodes is in primary role. (We detect split-brain
775 scenarios when two nodes connect; split-brain decisions are always
776 between two nodes.) The defined policies are:
777
778 disconnect
779 No automatic resynchronization; simply disconnect.
780
781 discard-younger-primary,
782 discard-older-primary
783 Resynchronize from the node which became primary first
784 (discard-younger-primary) or last (discard-older-primary). If
785 both nodes became primary independently, the
786 discard-least-changes policy is used.
787
788 discard-zero-changes
789 If only one of the nodes wrote data since the split brain
790 situation was detected, resynchronize from this node to the
791 other. If both nodes wrote data, disconnect.
792
793 discard-least-changes
794 Resynchronize from the node with more modified blocks.
795
796 discard-node-nodename
797 Always resynchronize to the named node.
798
799 after-sb-1pri policy
800 Define how to react if a split-brain scenario is detected, with one
801 node in primary role and one node in secondary role. (We detect
802 split-brain scenarios when two nodes connect, so split-brain
803 decisions are always between two nodes.) The defined policies are:
804
805 disconnect
806 No automatic resynchronization, simply disconnect.
807
808 consensus
809 Discard the data on the secondary node if the after-sb-0pri
810 algorithm would also discard the data on the secondary node.
811 Otherwise, disconnect.
812
813 violently-as0p
814 Always take the decision of the after-sb-0pri algorithm, even
815 if it causes an erratic change of the primary's view of the
816 data. This is only useful if a single-node file system (i.e.,
817 not OCFS2 or GFS) with the allow-two-primaries flag is used.
818 This option can cause the primary node to crash, and should not
819 be used.
820
821 discard-secondary
822 Discard the data on the secondary node.
823
824 call-pri-lost-after-sb
825 Always take the decision of the after-sb-0pri algorithm. If the
826 decision is to discard the data on the primary node, call the
827 pri-lost-after-sb handler on the primary node.
828
829 after-sb-2pri policy
830 Define how to react if a split-brain scenario is detected and both
831 nodes are in primary role. (We detect split-brain scenarios when
832 two nodes connect, so split-brain decisions are always between two
833 nodes.) The defined policies are:
834
835 disconnect
836 No automatic resynchronization, simply disconnect.
837
838 violently-as0p
839 See the violently-as0p policy for after-sb-1pri.
840
841 call-pri-lost-after-sb
842 Call the pri-lost-after-sb helper program on one of the
843 machines unless that machine can demote to secondary. The
844 helper program is expected to reboot the machine, which brings
845 the node into a secondary role. Which machine runs the helper
846 program is determined by the after-sb-0pri strategy.
847
848 allow-two-primaries
849
850 The most common way to configure DRBD devices is to allow only one
851 node to be primary (and thus writable) at a time.
852
853 In some scenarios it is preferable to allow two nodes to be primary
854 at once; a mechanism outside of DRBD then must make sure that
855 writes to the shared, replicated device happen in a coordinated
856 way. This can be done with a shared-storage cluster file system
857 like OCFS2 and GFS, or with virtual machine images and a virtual
858 machine manager that can migrate virtual machines between physical
859 machines.
860
861 The allow-two-primaries parameter tells DRBD to allow two nodes to
862 be primary at the same time. Never enable this option when using a
863 non-distributed file system; otherwise, data corruption and node
864 crashes will result!
865
866 always-asbp
867 Normally the automatic after-split-brain policies are only used if
868 current states of the UUIDs do not indicate the presence of a third
869 node.
870
871 With this option you request that the automatic after-split-brain
872 policies are used as long as the data sets of the nodes are somehow
873 related. This might cause a full sync, if the UUIDs indicate the
874 presence of a third node. (Or double faults led to strange UUID
875 sets.)
876
877 connect-int time
878
879 As soon as a connection between two nodes is configured with
880 drbdsetup connect, DRBD immediately tries to establish the
881 connection. If this fails, DRBD waits for connect-int seconds and
882 then repeats. The default value of connect-int is 10 seconds.
883
884 cram-hmac-alg hash-algorithm
885
886 Configure the hash-based message authentication code (HMAC) or
887 secure hash algorithm to use for peer authentication. The kernel
888 supports a number of different algorithms, some of which may be
889 loadable as kernel modules. See the shash algorithms listed in
890 /proc/crypto. By default, cram-hmac-alg is unset. Peer
891 authentication also requires a shared-secret to be configured.
892
893 csums-alg hash-algorithm
894
895 Normally, when two nodes resynchronize, the sync target requests a
896 piece of out-of-sync data from the sync source, and the sync source
897 sends the data. With many usage patterns, a significant number of
898 those blocks will actually be identical.
899
900 When a csums-alg algorithm is specified, when requesting a piece of
901 out-of-sync data, the sync target also sends along a hash of the
902 data it currently has. The sync source compares this hash with its
903 own version of the data. It sends the sync target the new data if
904 the hashes differ, and tells it that the data are the same
905 otherwise. This reduces the network bandwidth required, at the cost
906 of higher cpu utilization and possibly increased I/O on the sync
907 target.
908
909 The csums-alg can be set to one of the secure hash algorithms
910 supported by the kernel; see the shash algorithms listed in
911 /proc/crypto. By default, csums-alg is unset.
912
913 csums-after-crash-only
914
915 Enabling this option (and csums-alg, above) makes it possible to
916 use the checksum based resync only for the first resync after
917 a primary crash, but not for later "network hiccups".
918
919 In most cases, blocks that are marked as need-to-be-resynced are in
920 fact changed, so calculating checksums, and both reading and
921 writing the blocks on the resync target, is effectively all overhead.
922
923 The advantage of checksum based resync is mostly after primary
924 crash recovery, where the recovery marked larger areas (those
925 covered by the activity log) as need-to-be-resynced, just in case.
926 Introduced in 8.4.5.
927
928 data-integrity-alg alg
929 DRBD normally relies on the data integrity checks built into the
930 TCP/IP protocol, but if a data integrity algorithm is configured,
931 it will additionally use this algorithm to make sure that the data
932 received over the network match what the sender has sent. If a data
933 integrity error is detected, DRBD will close the network connection
934 and reconnect, which will trigger a resync.
935
936 The data-integrity-alg can be set to one of the secure hash
937 algorithms supported by the kernel; see the shash algorithms listed
938 in /proc/crypto. By default, this mechanism is turned off.
939
940 Because of the CPU overhead involved, we recommend not to use this
941 option in production environments. Also see the notes on data
942 integrity below.
943
944 fencing fencing_policy
945
946 Fencing is a preventive measure to avoid situations where both
947 nodes are primary and disconnected. This is also known as a
948 split-brain situation. DRBD supports the following fencing
949 policies:
950
951 dont-care
952 No fencing actions are taken. This is the default policy.
953
954 resource-only
955 If a node becomes a disconnected primary, it tries to fence the
956 peer. This is done by calling the fence-peer handler. The
957 handler is supposed to reach the peer over an alternative
958 communication path and call 'drbdadm outdate minor' there.
959
960 resource-and-stonith
961 If a node becomes a disconnected primary, it freezes all its IO
962 operations and calls its fence-peer handler. The fence-peer
963 handler is supposed to reach the peer over an alternative
964 communication path and call 'drbdadm outdate minor' there. In
965 case it cannot do that, it should stonith the peer. IO is
966 resumed as soon as the situation is resolved. In case the
967 fence-peer handler fails, I/O can be resumed manually with
968 'drbdadm resume-io'.
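
    Fencing is typically combined with a fence-peer handler; a sketch for
    a Pacemaker cluster might look like this (the handler paths are
    illustrative and depend on the installed drbd-utils):

        resource r0 {
            net {
                fencing resource-and-stonith;
            }
            handlers {
                fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
                unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
            }
        }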
969
970 ko-count number
971
972 If a secondary node fails to complete a write request in ko-count
973 times the timeout parameter, it is excluded from the cluster. The
974 primary node then sets the connection to this secondary node to
975 Standalone. To disable this feature, you should explicitly set it
976 to 0; defaults may change between versions.
977
978 max-buffers number
979
980 Limits the memory usage per DRBD minor device on the receiving
981 side, or for internal buffers during resync or online-verify. Unit
982 is PAGE_SIZE, which is 4 KiB on most systems. The minimum possible
983 setting is hard coded to 32 (=128 KiB). These buffers are used to
984 hold data blocks while they are written to/read from disk. To avoid
985 possible distributed deadlocks on congestion, this setting is used
986 as a throttle threshold rather than a hard limit. Once more than
987 max-buffers pages are in use, further allocation from this pool is
988 throttled. You want to increase max-buffers if you cannot saturate
989 the IO backend on the receiving side.
990
991 max-epoch-size number
992
993 Define the maximum number of write requests DRBD may issue before
994 issuing a write barrier. The default value is 2048, with a minimum
995 of 1 and a maximum of 20000. Setting this parameter to a value
996 below 10 is likely to decrease performance.
997
998 on-congestion policy,
999 congestion-fill threshold,
1000 congestion-extents threshold
1001 By default, DRBD blocks when the TCP send queue is full. This
1002 prevents applications from generating further write requests until
1003 more buffer space becomes available again.
1004
1005 When DRBD is used together with DRBD-proxy, it can be better to use
1006 the pull-ahead on-congestion policy, which can switch DRBD into
1007 ahead/behind mode before the send queue is full. DRBD then records
1008 the differences between itself and the peer in its bitmap, but it
1009 no longer replicates them to the peer. When enough buffer space
1010 becomes available again, the node resynchronizes with the peer and
1011 switches back to normal replication.
1012
1013 This has the advantage of not blocking application I/O even when
1014 the queues fill up, and the disadvantage that peer nodes can fall
1015 behind much further. Also, while resynchronizing, peer nodes will
1016 become inconsistent.
1017
1018 The available congestion policies are block (the default) and
1019 pull-ahead. The congestion-fill parameter defines how much data is
1020 allowed to be "in flight" in this connection. The default value is
1021 0, which disables this mechanism of congestion control, with a
1022 maximum of 10 GiBytes. The congestion-extents parameter defines how
1023 many bitmap extents may be active before switching into
1024 ahead/behind mode, with the same default and limits as the
1025 al-extents parameter. The congestion-extents parameter is effective
1026 only when set to a value smaller than al-extents.
1027
1028 Ahead/behind mode is available since DRBD 8.3.10.
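
    A sketch for a long-distance link through DRBD-proxy (the thresholds
    are placeholders and need tuning to the actual buffer sizes):

        net {
            on-congestion pull-ahead;
            congestion-fill 1G;
            congestion-extents 500;
        }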
1029
1030 ping-int interval
1031
1032 When the TCP/IP connection to a peer is idle for more than ping-int
1033 seconds, DRBD will send a keep-alive packet to make sure that a
1034 failed peer or network connection is detected reasonably soon. The
1035 default value is 10 seconds, with a minimum of 1 and a maximum of
1036 120 seconds. The unit is seconds.
1037
1038 ping-timeout timeout
1039
1040 Define the timeout for replies to keep-alive packets. If the peer
1041 does not reply within ping-timeout, DRBD will close and try to
1042 reestablish the connection. The default value is 0.5 seconds, with
1043 a minimum of 0.1 seconds and a maximum of 30 seconds. The unit is
1044 tenths of a second.
1045
1046 socket-check-timeout timeout
1047 In setups involving a DRBD-proxy and connections that experience a
1048 lot of buffer-bloat, it might be necessary to set ping-timeout to
1049 an unusually high value. By default, DRBD uses the same value to
1050 wait until a newly established TCP connection is considered stable.
1051 Since the DRBD-proxy is usually located in the same data center,
1052 such a long wait time may hinder DRBD's connect process.
1053
1054 In such setups, socket-check-timeout should be set to at least
1055 the round-trip time between DRBD and DRBD-proxy; in most cases
1056 that is 1.
1057
1058 The default unit is tenths of a second, the default value is 0
1059 (which causes DRBD to use the value of ping-timeout instead).
1060 Introduced in 8.4.5.
1061
1062 protocol name
1063 Use the specified protocol on this connection. The supported
1064 protocols are:
1065
1066 A
1067 Writes to the DRBD device complete as soon as they have reached
1068 the local disk and the TCP/IP send buffer.
1069
1070 B
1071 Writes to the DRBD device complete as soon as they have reached
1072 the local disk, and all peers have acknowledged the receipt of
1073 the write requests.
1074
1075 C
1076 Writes to the DRBD device complete as soon as they have reached
1077 the local and all remote disks.
1078
1079
1080 rcvbuf-size size
1081
1082 Configure the size of the TCP/IP receive buffer. A value of 0 (the
1083 default) causes the buffer size to adjust dynamically. This
1084 parameter usually does not need to be set, but it can be set to a
1085 value up to 10 MiB. The default unit is bytes.
1086
1087 rr-conflict policy
1088 This option helps to solve the cases when the outcome of the resync
1089 decision is incompatible with the current role assignment in the
1090 cluster. The defined policies are:
1091
1092 disconnect
1093 No automatic resynchronization, simply disconnect.
1094
1095 retry-connect
1096 Disconnect now, and retry to connect immediately afterwards.
1097
1098 violently
1099 Resync to the primary node is allowed, violating the assumption
1100 that data on a block device are stable for one of the nodes.
1101 Do not use this option; it is dangerous.
1102
1103 call-pri-lost
1104 Call the pri-lost handler on one of the machines. The handler
1105 is expected to reboot the machine, which puts it into secondary
1106 role.
1107
1108 auto-discard
1109 Auto-discard reverses the resync direction, so that DRBD
1110 resyncs the current primary to the current secondary.
1111 Auto-discard only applies when protocol A is in use and the
1112 resync decision is based on the principle that a crashed
1113 primary should be the source of a resync. When a primary node
1114 crashes, it might have written some last updates to its disk,
1115 which were not received by a protocol A secondary. By promoting
1116 the secondary in the meantime, the user accepted that those last
1117 updates have been lost. By using auto-discard, you consent that
1118 the last updates (from before the crash of the primary) should be
1119 rolled back automatically.
1120
1121 shared-secret secret
1122
1123 Configure the shared secret used for peer authentication. The
1124 secret is a string of up to 64 characters. Peer authentication also
1125 requires the cram-hmac-alg parameter to be set.
1126
1127 sndbuf-size size
1128
1129 Configure the size of the TCP/IP send buffer. Since DRBD 8.0.13 /
1130 8.2.7, a value of 0 (the default) causes the buffer size to adjust
1131 dynamically. Values below 32 KiB are harmful to the throughput on
1132 this connection. Large buffer sizes can be useful especially when
1133 protocol A is used over high-latency networks; the maximum value
1134 supported is 10 MiB.
1135
1136 tcp-cork
1137 By default, DRBD uses the TCP_CORK socket option to prevent the
1138 kernel from sending partial messages; this results in fewer and
1139 bigger packets on the network. Some network stacks can perform
1140 worse with this optimization. On these, the tcp-cork parameter can
1141 be used to turn this optimization off.
1142
1143 timeout time
1144
1145 Define the timeout for replies over the network: if a peer node
1146 does not send an expected reply within the specified timeout, it is
1147 considered dead and the TCP/IP connection is closed. The timeout
1148 value must be lower than connect-int and lower than ping-int. The
1149 default is 6 seconds; the value is specified in tenths of a second.
1150
1151 transport type
1152
1153 With DRBD 9, the network transport used by DRBD is loaded as a
1154 separate module. With this option you can specify which transport
1155 and module to load. At present only two options exist, tcp and
1156 rdma. Please note that currently the RDMA transport module is only
1157 available with a license purchased from LINBIT. The default is tcp.
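
    For example, to request the RDMA transport for one connection
    (assuming the corresponding transport module is installed):

        connection {
            host "alice";
            host "bob";
            net {
                transport rdma;
            }
        }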
1158
1159 use-rle
1160
1161 Each replicated device on a cluster node has a separate bitmap for
1162 each of its peer devices. The bitmaps are used for tracking the
1163 differences between the local and peer device: depending on the
1164 cluster state, a disk range can be marked as different from the
1165 peer in the device's bitmap, in the peer device's bitmap, or in
1166 both bitmaps. When two cluster nodes connect, they exchange each
1167 other's bitmaps, and they each compute the union of the local and
1168 peer bitmap to determine the overall differences.
1169
1170 Bitmaps of very large devices are also relatively large, but they
1171 usually compress very well using run-length encoding. This can save
1172 time and bandwidth for the bitmap transfers.
1173
1174 The use-rle parameter determines if run-length encoding should be
1175 used. It is on by default since DRBD 8.4.0.
1176
1177 verify-alg hash-algorithm
1178 Online verification (drbdadm verify) computes and compares
1179 checksums of disk blocks (i.e., hash values) in order to detect if
1180 they differ. The verify-alg parameter determines which algorithm to
1181 use for these checksums. It must be set to one of the secure hash
1182 algorithms supported by the kernel before online verify can be
1183 used; see the shash algorithms listed in /proc/crypto.
1184
1185 We recommend scheduling online verifications regularly during
1186 low-load periods, for example once a month. Also see the notes on
1187 data integrity below.
1188
1189 allow-remote-read bool-value
1190 Allows or disallows DRBD to read from a peer node.
1191
1192 When the disk of a primary node is detached, DRBD will try to
1193 continue reading and writing from another node in the cluster. For
1194 this purpose, it searches for nodes with up-to-date data, and uses
1195 any found node to resume operations. In some cases it may not be
1196 desirable to read back data from a peer node, because the node
1197 should only be used as a replication target. In this case, the
1198 allow-remote-read parameter can be set to no, which would prohibit
1199 this node from reading data from the peer node.
1200
1201 The allow-remote-read parameter is available since DRBD 9.0.19, and
1202 defaults to yes.
1203
1204 Section on Parameters
1205 address [address-family] address:port
1206
1207 Defines the address family, address, and port of a connection
1208 endpoint.
1209
1210 The address families ipv4, ipv6, ssocks (Dolphin Interconnect
1211 Solutions' "super sockets"), sdp (Infiniband Sockets Direct
1212 Protocol), and sci are supported (sci is an alias for ssocks). If
1213 no address family is specified, ipv4 is assumed. For all address
1214 families except ipv6, the address is specified in IPv4 address
1215 notation (for example, 1.2.3.4). For ipv6, the address is enclosed
1216 in brackets and uses IPv6 address notation (for example,
1217 [fd01:2345:6789:abcd::1]). The port is always specified as a
1218 decimal number from 1 to 65535.
1219
1220 On each host, the port numbers must be unique for each address;
1221 ports cannot be shared.
1222
1223 node-id value
1224
1225 Defines the unique node identifier for a node in the cluster. Node
1226 identifiers are used to identify individual nodes in the network
1227 protocol, and to assign bitmap slots to nodes in the metadata.
1228
1229 Node identifiers can only be reassigned in a cluster when the
1230 cluster is down. It is essential that the node identifiers in the
1231 configuration and in the device metadata are changed consistently
1232 on all hosts. To change the metadata, dump the current state with
1233 drbdmeta dump-md, adjust the bitmap slot assignment, and update the
1234 metadata with drbdmeta restore-md.
1235
1236 The node-id parameter exists since DRBD 9. Its value ranges from 0
1237 to 16; there is no default.
1238
1239 Section options Parameters (Resource Options)
1240 auto-promote bool-value
1241 A resource must be promoted to primary role before any of its
1242 devices can be mounted or opened for writing.
1243
1244 Before DRBD 9, this could only be done explicitly ("drbdadm
1245 primary"). Since DRBD 9, the auto-promote parameter allows a resource
1246 to be automatically promoted to primary role when one of its
1247 devices is mounted or opened for writing. As soon as all devices
1248 are unmounted or closed with no more remaining users, the role of
1249 the resource changes back to secondary.
1250
1251 Automatic promotion only succeeds if the cluster state allows it
1252 (that is, if an explicit drbdadm primary command would succeed).
1253 Otherwise, mounting or opening the device fails as it already did
1254 before DRBD 9: the mount(2) system call fails with errno set to
1255 EROFS (Read-only file system); the open(2) system call fails with
1256 errno set to EMEDIUMTYPE (wrong medium type).
1257
1258 Irrespective of the auto-promote parameter, if a device is promoted
1259 explicitly (drbdadm primary), it also needs to be demoted
1260 explicitly (drbdadm secondary).
1261
1262 The auto-promote parameter is available since DRBD 9.0.0, and
1263 defaults to yes.
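
    If the automatic behavior is not wanted, it can be disabled per
    resource or in the common section, for example:

        options {
            auto-promote no;
        }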
1264
1265 cpu-mask cpu-mask
1266
1267 Set the cpu affinity mask for DRBD kernel threads. The cpu mask is
1268 specified as a hexadecimal number. The default value is 0, which
1269 lets the scheduler decide which kernel threads run on which CPUs.
1270 CPU numbers in cpu-mask which do not exist in the system are
1271 ignored.
1272
1273 on-no-data-accessible policy
1274 Determine how to deal with I/O requests when the requested data is
1275 not available locally or remotely (for example, when all disks have
1276 failed). When quorum is enabled, on-no-data-accessible should be
1277 set to the same value as on-no-quorum. The defined policies are:
1278
1279 io-error
1280 System calls fail with errno set to EIO.
1281
1282 suspend-io
1283 The resource suspends I/O. I/O can be resumed by (re)attaching
1284 the lower-level device, by connecting to a peer which has
1285 access to the data, or by forcing DRBD to resume I/O with
1286 drbdadm resume-io res. When no data is available, forcing I/O
1287 to resume will result in the same behavior as the io-error
1288 policy.
1289
1290 This setting is available since DRBD 8.3.9; the default policy is
1291 io-error.
1292
1293 peer-ack-window value
1294
1295 On each node and for each device, DRBD maintains a bitmap of the
1296 differences between the local and remote data for each peer device.
1297 For example, in a three-node setup (nodes A, B, C) each with a
1298 single device, every node maintains one bitmap for each of its
1299 peers.
1300
1301 When nodes receive write requests, they know how to update the
1302 bitmaps for the writing node, but not how to update the bitmaps
1303 between themselves. In this example, when a write request
1304 propagates from node A to B and C, nodes B and C know that they
1305 have the same data as node A, but not whether or not they both have
1306 the same data.
1307
1308 As a remedy, the writing node occasionally sends peer-ack packets
1309 to its peers which tell them which state they are in relative to
1310 each other.
1311
1312 The peer-ack-window parameter specifies how much data a primary
1313 node may send before sending a peer-ack packet. A low value causes
1314 increased network traffic; a high value causes less network traffic
1315 but higher memory consumption on secondary nodes and higher resync
1316 times between the secondary nodes after primary node failures.
1317 (Note: peer-ack packets may be sent due to other reasons as well,
1318 e.g. membership changes or expiry of the peer-ack-delay timer.)
1319
1320 The default value for peer-ack-window is 2 MiB, the default unit is
1321 sectors. This option is available since 9.0.0.
1322
1323 peer-ack-delay expiry-time
1324
1325 If after the last finished write request no new write request gets
1326 issued for expiry-time, then a peer-ack packet is sent. If a new
1327 write request is issued before the timer expires, the timer gets
1328 reset to expiry-time. (Note: peer-ack packets may be sent due to
1329 other reasons as well, e.g. membership changes or the
1330 peer-ack-window option.)
1331
1332 This parameter may influence resync behavior on remote nodes. Peer
1333 nodes need to wait until they receive a peer-ack before releasing a
1334 lock on an AL-extent. Resync operations between peers may need to
1335 wait for these locks.
1336
1337 The default value for peer-ack-delay is 100 milliseconds, the
1338 default unit is milliseconds. This option is available since 9.0.0.
1339
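As an illustration, the following fragment raises the window and
shortens the delay, trading slightly higher memory use on the
secondaries for fewer peer-ack packets; the numbers are examples only:

options {
    peer-ack-window 4M;     # send a peer-ack after at most about 4 MiB of writes
    peer-ack-delay  50;     # ... or after 50 milliseconds of write inactivity
}
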
1340 quorum value
1341
1342 When activated, a cluster partition requires quorum in order to
1343 modify the replicated data set. That means a node in the cluster
1344 partition can only be promoted to primary if the cluster partition
1345 has quorum. Every node with a disk directly connected to the node
that should be promoted counts. If a primary node needs to execute a
write request but its cluster partition has lost quorum, it will
freeze IO or reject the write request with an error (depending on
the on-no-quorum setting). Upon losing quorum, a primary always
invokes the quorum-lost handler. The handler is intended for
notification purposes; its return code is ignored.
1352
The option's value can be set to off, majority, all or a numeric
value. If you set it to a numeric value, make sure that the value
is greater than half of your number of nodes. Quorum is a mechanism
to avoid data divergence; it can be used instead of fencing when
there are more than two replicas. It defaults to off.
1358
1359 If all missing nodes are marked as outdated, a partition always has
quorum, no matter how small it is. That is, if you gracefully
disconnect all secondary nodes, a single primary continues to
operate. The moment a single secondary is lost, it has to be assumed
that it forms a partition with all the missing outdated nodes. If
the primary's own partition might then be smaller than the other
one, quorum is lost at that moment.
1366
If you want to allow permanently diskless nodes to gain quorum, it
is recommended not to use majority or all. It is recommended to
specify an absolute number instead, since DRBD's heuristic to
determine the complete number of diskful nodes in the cluster is
unreliable.
1371
1372 The quorum implementation is available starting with the DRBD
1373 kernel driver version 9.0.7.
1374
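A three-node resource that relies on quorum instead of fencing might
use a fragment like the following sketch (the on-no-quorum value shown
is one possible choice, not the default):

options {
    quorum majority;          # a partition needs 2 of 3 nodes to modify data
    on-no-quorum io-error;    # reject writes instead of freezing them
}
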
1375 quorum-minimum-redundancy value
1376
This option sets the minimum number of nodes with an UpToDate disk
that is required to allow the partition to gain quorum. This is a
different requirement than the one the plain quorum option expresses.
1380
The option's value can be set to off, majority, all or a numeric
1382 value. If you set it to a numeric value, make sure that the value
1383 is greater than half of your number of nodes.
1384
If you want to allow permanently diskless nodes to gain quorum, it
is recommended not to use majority or all. It is recommended to
specify an absolute number instead, since DRBD's heuristic to
determine the complete number of diskful nodes in the cluster is
unreliable.
1389
1390 This option is available starting with the DRBD kernel driver
1391 version 9.0.10.
1392
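As a sketch, a cluster with three diskful nodes and one permanently
diskless node could use an absolute quorum value together with a
redundancy requirement; the numbers below are illustrative and must be
adapted to the actual node count:

options {
    quorum 3;                        # absolute value, more robust with diskless nodes
    quorum-minimum-redundancy 2;     # require at least two UpToDate disks
}
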
1393 on-no-quorum {io-error | suspend-io}
1394
By default, DRBD freezes IO on a device that has lost quorum. By
setting on-no-quorum to io-error, all IO operations are completed
with an error if quorum is lost.
1398
Usually, the on-no-data-accessible parameter should be set to the
same value as on-no-quorum, as it has precedence.
1401
The on-no-quorum option is available starting with the DRBD kernel
1403 driver version 9.0.8.
1404
1405 on-suspended-primary-outdated {disconnect | force-secondary}
1406
1407 This setting is only relevant when on-no-quorum is set to
suspend-io. It applies in the following scenario: a primary
node loses quorum and hence has all IO requests frozen. This primary
1410 node then connects to another, quorate partition. It detects that a
1411 node in this quorate partition was promoted to primary, and started
1412 a newer data-generation there. As a result, the first primary
1413 learns that it has to consider itself outdated.
1414
When this option is set to force-secondary, the node demotes itself
to secondary immediately and fails all pending (and new) IO requests
with IO errors. It refuses to allow any process to open the DRBD
device until all current openers have closed it. This state is
visible in status and events2 under the name force-io-failures.
1420
1421 The disconnect setting simply causes that node to reject connect
1422 attempts and stay isolated.
1423
1424 The on-suspended-primary-outdated option is available starting with
1425 the DRBD kernel driver version 9.1.7. It has a default value of
1426 disconnect.
1427
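A resource that freezes I/O on quorum loss but gives up its primary
role as soon as it learns that it is outdated could combine the two
options as in this sketch:

options {
    on-no-quorum suspend-io;
    on-suspended-primary-outdated force-secondary;
}
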
1428 Section startup Parameters
1429 The parameters in this section define the behavior of DRBD at system
1430 startup time, in the DRBD init script. They have no effect once the
1431 system is up and running.
1432
1433 degr-wfc-timeout timeout
1434
1435 Define how long to wait until all peers are connected in case the
1436 cluster consisted of a single node only when the system went down.
1437 This parameter is usually set to a value smaller than wfc-timeout.
1438 The assumption here is that peers which were unreachable before a
1439 reboot are less likely to be reachable after the reboot, so waiting
1440 is less likely to help.
1441
1442 The timeout is specified in seconds. The default value is 0, which
1443 stands for an infinite timeout. Also see the wfc-timeout parameter.
1444
1445 outdated-wfc-timeout timeout
1446
1447 Define how long to wait until all peers are connected if all peers
1448 were outdated when the system went down. This parameter is usually
1449 set to a value smaller than wfc-timeout. The assumption here is
1450 that an outdated peer cannot have become primary in the meantime,
1451 so we don't need to wait for it as long as for a node which was
1452 alive before.
1453
1454 The timeout is specified in seconds. The default value is 0, which
1455 stands for an infinite timeout. Also see the wfc-timeout parameter.
1456
1457 stacked-timeouts
1458 On stacked devices, the wfc-timeout and degr-wfc-timeout parameters
1459 in the configuration are usually ignored, and both timeouts are set
1460 to twice the connect-int timeout. The stacked-timeouts parameter
1461 tells DRBD to use the wfc-timeout and degr-wfc-timeout parameters
1462 as defined in the configuration, even on stacked devices. Only use
1463 this parameter if the peer of the stacked resource is usually not
1464 available, or will not become primary. Incorrect use of this
1465 parameter can lead to unexpected split-brain scenarios.
1466
1467 wait-after-sb
1468 This parameter causes DRBD to continue waiting in the init script
1469 even when a split-brain situation has been detected, and the nodes
1470 therefore refuse to connect to each other.
1471
1472 wfc-timeout timeout
1473
1474 Define how long the init script waits until all peers are
1475 connected. This can be useful in combination with a cluster manager
1476 which cannot manage DRBD resources: when the cluster manager
1477 starts, the DRBD resources will already be up and running. With a
1478 more capable cluster manager such as Pacemaker, it makes more sense
1479 to let the cluster manager control DRBD resources. The timeout is
1480 specified in seconds. The default value is 0, which stands for an
1481 infinite timeout. Also see the degr-wfc-timeout parameter.
1482
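For a setup where the DRBD init script should wait a bounded time for
the peers at boot, a startup section might look like the following
sketch; the timeout values are examples only:

startup {
    wfc-timeout          120;   # wait up to two minutes for all peers
    degr-wfc-timeout      60;   # shorter wait if the cluster was degraded before
    outdated-wfc-timeout  30;   # even shorter if all peers were outdated
}
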
1483 Section volume Parameters
1484 device /dev/drbdminor-number
1485
1486 Define the device name and minor number of a replicated block
1487 device. This is the device that applications are supposed to
1488 access; in most cases, the device is not used directly, but as a
1489 file system. This parameter is required and the standard device
1490 naming convention is assumed.
1491
1492 In addition to this device, udev will create
1493 /dev/drbd/by-res/resource/volume and
1494 /dev/drbd/by-disk/lower-level-device symlinks to the device.
1495
1496 disk {[disk] | none}
1497
1498 Define the lower-level block device that DRBD will use for storing
1499 the actual data. While the replicated drbd device is configured,
1500 the lower-level device must not be used directly. Even read-only
1501 access with tools like dumpe2fs(8) and similar is not allowed. The
1502 keyword none specifies that no lower-level block device is
1503 configured; this also overrides inheritance of the lower-level
1504 device.
1505
1506 meta-disk internal,
1507 meta-disk device,
1508 meta-disk device [index]
1509
1510 Define where the metadata of a replicated block device resides: it
1511 can be internal, meaning that the lower-level device contains both
1512 the data and the metadata, or on a separate device.
1513
1514 When the index form of this parameter is used, multiple replicated
1515 devices can share the same metadata device, each using a separate
1516 index. Each index occupies 128 MiB of data, which corresponds to a
1517 replicated device size of at most 4 TiB with two cluster nodes. We
recommend not sharing metadata devices anymore, and instead using
the LVM volume manager to create metadata devices as needed.
1520
1521 When the index form of this parameter is not used, the size of the
1522 lower-level device determines the size of the metadata. The size
1523 needed is 36 KiB + (size of lower-level device) / 32K * (number of
1524 nodes - 1). If the metadata device is bigger than that, the extra
1525 space is not used.
1526
1527 This parameter is required if a disk other than none is specified,
1528 and ignored if disk is set to none. A meta-disk parameter without a
1529 disk parameter is not allowed.
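
As a rough worked example of the sizing formula above: with two
nodes, a 1 TiB lower-level device needs about 36 KiB + 1 TiB / 32K * 1,
i.e. roughly 32 MiB of metadata. A volume that keeps its metadata on a
dedicated logical volume could then be declared as in the following
sketch (all device names are illustrative):

volume 0 {
    device    "/dev/drbd2";
    disk      "/dev/vg0/data";        # 1 TiB lower-level device
    meta-disk "/dev/vg0/data-md";     # dedicated metadata device, at least ~33 MiB
}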

NOTES ON DATA INTEGRITY
1532 DRBD supports two different mechanisms for data integrity checking:
first, the data-integrity-alg network parameter allows adding a
checksum to the data sent over the network. Second, the online
verification mechanism (drbdadm verify and the verify-alg parameter)
allows checking for differences in the on-disk data.
1537
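The two mechanisms are configured in the net section of a resource; the
following fragment is a sketch only, and the choice of sha1 is an
example rather than a recommendation:

net {
    data-integrity-alg sha1;   # checksum every data packet sent over the network
    verify-alg         sha1;   # digest used by online verification (drbdadm verify)
}

An online verification run is then started manually or from cron with
drbdadm verify resource.
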
1538 Both mechanisms can produce false positives if the data is modified
1539 during I/O (i.e., while it is being sent over the network or written to
1540 disk). This does not always indicate a problem: for example, some file
1541 systems and applications do modify data under I/O for certain
1542 operations. Swap space can also undergo changes while under I/O.
1543
1544 Network data integrity checking tries to identify data modification
1545 during I/O by verifying the checksums on the sender side after sending
1546 the data. If it detects a mismatch, it logs an error. The receiver also
1547 logs an error when it detects a mismatch. Thus, an error logged only on
1548 the receiver side indicates an error on the network, and an error
1549 logged on both sides indicates data modification under I/O.
1550
1551 The most recent example of systematic data corruption was identified as
1552 a bug in the TCP offloading engine and driver of a certain type of GBit
1553 NIC in 2007: the data corruption happened on the DMA transfer from core
memory to the card. Because the TCP checksums were calculated on the
1555 card, the TCP/IP protocol checksums did not reveal this problem.

VERSION
1558 This document was revised for version 9.0.0 of the DRBD distribution.

AUTHOR
1561 Written by Philipp Reisner <philipp.reisner@linbit.com> and Lars
1562 Ellenberg <lars.ellenberg@linbit.com>.

REPORTING BUGS
1565 Report bugs to <drbd-user@lists.linbit.com>.

COPYRIGHT
1568 Copyright 2001-2018 LINBIT Information Technologies, Philipp Reisner,
1569 Lars Ellenberg. This is free software; see the source for copying
1570 conditions. There is NO warranty; not even for MERCHANTABILITY or
1571 FITNESS FOR A PARTICULAR PURPOSE.

SEE ALSO
1574 drbd(8), drbdsetup(8), drbdadm(8), DRBD User's Guide[1], DRBD Web
1575 Site[3]

NOTES
1578 1. DRBD User's Guide
1579 http://www.drbd.org/users-guide/
1580
1581 2.
1582
1583 Online Usage Counter
1584 http://usage.drbd.org
1585
1586 3. DRBD Web Site
1587 http://www.drbd.org/
1588
1589
1590
DRBD 9.0.x                     17 January 2018                    DRBD.CONF(5)