1DRBD.CONF(5)                  Configuration Files                 DRBD.CONF(5)
2
3
4

NAME

6       drbd.conf - DRBD Configuration Files
7

INTRODUCTION

9       DRBD implements block devices which replicate their data to all nodes
10       of a cluster. The actual data and associated metadata are usually
11       stored redundantly on "ordinary" block devices on each cluster node.
12
13       Replicated block devices are called /dev/drbdminor by default. They are
14       grouped into resources, with one or more devices per resource.
15       Replication among the devices in a resource takes place in
16       chronological order. With DRBD, we refer to the devices inside a
17       resource as volumes.
18
19       In DRBD 9, a resource can be replicated between two or more cluster
20       nodes. The connections between cluster nodes are point-to-point links,
21       and use TCP or a TCP-like protocol. All nodes must be directly
22       connected.
23
24       DRBD consists of low-level user-space components which interact with
25       the kernel and perform basic operations (drbdsetup, drbdmeta), a
26       high-level user-space component which understands and processes the
27       DRBD configuration and translates it into basic operations of the
28       low-level components (drbdadm), and a kernel component.
29
30       The default DRBD configuration consists of /etc/drbd.conf and of
31       additional files included from there, usually global_common.conf and
32       all *.res files inside /etc/drbd.d/. It has turned out to be useful to
33       define each resource in a separate *.res file.
34
35       The configuration files are designed so that each cluster node can
36       contain an identical copy of the entire cluster configuration. The host
37       name of each node, as reported by uname -n, determines which parts of
38       the configuration apply. It is highly recommended to keep the cluster
39       configuration on all nodes in sync, either by copying it to all nodes
40       manually or by automating the process with csync2 or a similar tool.
41

EXAMPLE CONFIGURATION FILE

43           global {
44                usage-count yes;
45                udev-always-use-vnr;
46           }
47           resource r0 {
48                 net {
49                      cram-hmac-alg sha1;
50                      shared-secret "FooFunFactory";
51                 }
52                 volume 0 {
53                      device    /dev/drbd1;
54                      disk      /dev/sda7;
55                      meta-disk internal;
56                 }
57                 on alice {
58                      node-id   0;
59                      address   10.1.1.31:7000;
60                 }
61                 on bob {
62                      node-id   1;
63                      address   10.1.1.32:7000;
64                 }
65                 connection {
66                      host      alice  port 7000;
67                      host      bob    port 7000;
68                      net {
69                          protocol C;
70                      }
71                 }
72           }
73
74       This example defines a resource r0 which contains a single replicated
75       device with volume number 0. The resource is replicated among hosts
76       alice and bob, which have the IPv4 addresses 10.1.1.31 and 10.1.1.32
77       and the node identifiers 0 and 1, respectively. On both hosts, the
78       replicated device is called /dev/drbd1, and the actual data and
79       metadata are stored on the lower-level device /dev/sda7. The connection
80       between the hosts uses protocol C.
81
82       Please refer to the DRBD User's Guide[1] for more examples.
83

FILE FORMAT

85       DRBD configuration files consist of sections, which contain other
86       sections and parameters depending on the section types. Each section
87       consists of one or more keywords, sometimes a section name, an opening
88       brace (“{”), the section's contents, and a closing brace (“}”).
89       Parameters inside a section consist of a keyword, followed by one or
90       more keywords or values, and a semicolon (“;”).
91
92       Some parameter values have a default scale which applies when a plain
93       number is specified (for example Kilo, or 1024 times the numeric
94       value). Such default scales can be overridden by using a suffix (for
95       example, M for Mega). The common suffixes K = 2^10 = 1024, M = 1024 K,
96       and G = 1024 M are supported.
97
98       Comments start with a hash sign (“#”) and extend to the end of the
99       line. In addition, any section can be prefixed with the keyword skip,
100       which causes the section and any sub-sections to be ignored.
101
102       Additional files can be included with the include file-pattern
103       statement (see glob(7) for the expressions supported in file-pattern).
104       Include statements are only allowed outside of sections.
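
       For illustration, a top-level /etc/drbd.conf might look like the
       following sketch (the file names below /etc/drbd.d/ are placeholders);
       the skip keyword disables the r-test resource without deleting its
       definition:

           include "drbd.d/global_common.conf";
           include "drbd.d/*.res";

           skip resource r-test {
                # this entire section is ignored
           }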
105
106       The following sections are defined (indentation indicates in which
107       context):
108
109           common
110              [disk]
111              [handlers]
112              [net]
113              [options]
114              [startup]
115           global
116           [require-drbd-module-version-{eq,ne,gt,ge,lt,le}]
117           resource
118              connection
119                 multiple path | 2 host
120                 [net]
121                 [volume]
122                    [peer-device-options]
123                 [peer-device-options]
124              connection-mesh
125                 [net]
126              [disk]
127              floating
128              handlers
129              [net]
130              on
131                 volume
132                    disk
133                    [disk]
134              options
135              stacked-on-top-of
136              startup
137
138       Sections in brackets affect other parts of the configuration: inside
139       the common section, they apply to all resources. A disk section inside
140       a resource or on section applies to all volumes of that resource, and a
141       net section inside a resource section applies to all connections of
142       that resource. This avoids having to repeat identical options for
143       each resource, connection, or volume. Options can be overridden in a
144       more specific resource, connection, on, or volume section.
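
       As a sketch of this inheritance (resource names are placeholders),
       options set in the common section apply to every resource unless a
       more specific section overrides them:

           common {
                net {
                     protocol C;
                }
                disk {
                     on-io-error detach;
                }
           }
           resource r1 {
                net {
                     protocol A;   # overrides the inherited protocol C
                }
                # on-io-error detach is inherited from common
                # (on, volume, and connection sections omitted)
           }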
145
146       peer-device-options are resync-rate, c-plan-ahead, c-delay-target,
147       c-fill-target, c-max-rate and c-min-rate. For backward
148       compatibility, they can also be specified in any disk options section.
149       They are inherited into all relevant connections. If they are
150       given at the connection level, they are inherited by all volumes on that
151       connection. A peer-device-options section is started with the disk
152       keyword.
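
       For example, peer-device options can be set for a single connection by
       opening a disk section inside it (host names and the value shown are
       merely illustrative):

           resource r1 {
                connection {
                     host alpha port 7789;
                     host bravo port 7789;
                     disk {
                          resync-rate 33M;  # peer-device option
                     }
                }
                # (on and volume sections omitted)
           }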
153
154   Sections
155       common
156
157           This section can contain one each of a disk, handlers, net, options,
158           and startup section. All resources inherit the parameters in these
159           sections as their default values.
160
161       connection [name]
162
163           Define a connection between two hosts. This section must contain
164           two host parameters or multiple path sections. The optional name is
165           used to refer to the connection in the system log and in other
166           messages. If no name is specified, the peer's host name is used
167           instead.
168
169       path
170
171           Define a path between two hosts. This section must contain two host
172           parameters.
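
           A connection with two paths might be sketched like this (host
           names and addresses are placeholders):

               connection {
                    path {
                         host alpha address 192.168.41.1 port 7000;
                         host bravo address 192.168.41.2 port 7000;
                    }
                    path {
                         host alpha address 192.168.42.1 port 7000;
                         host bravo address 192.168.42.2 port 7000;
                    }
               }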
173
174       connection-mesh
175
176           Define a connection mesh between multiple hosts. This section must
177           contain a hosts parameter, which has the host names as arguments.
178           This section is a shortcut to define many connections which share
179           the same network options.
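
           For example, a fully meshed three-node resource could be sketched
           as follows (host names are placeholders):

               connection-mesh {
                    hosts alpha bravo charlie;
                    net {
                         protocol C;
                    }
               }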
180
181       disk
182
183           Define parameters for a volume. All parameters in this section are
184           optional.
185
186       floating [address-family] addr:port
187
188           Like the on section, except that instead of the host name a network
189           address is used to determine if it matches a floating section.
190
191           The node-id parameter in this section is required. If the address
192           parameter is not provided, no connections to peers will be created
193           by default. The device, disk, and meta-disk parameters must be
194           defined in, or inherited by, this section.
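
           A sketch using floating sections instead of on sections (addresses
           and device names are placeholders):

               resource r2 {
                    floating 10.1.1.31:7011 {
                         node-id   0;
                         device    /dev/drbd2;
                         disk      /dev/sdb1;
                         meta-disk internal;
                    }
                    floating 10.1.1.32:7011 {
                         node-id   1;
                         device    /dev/drbd2;
                         disk      /dev/sdb1;
                         meta-disk internal;
                    }
               }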
195
196       global
197
198           Define some global parameters. All parameters in this section are
199           optional. Only one global section is allowed in the configuration.
200
201       require-drbd-module-version-{eq,ne,gt,ge,lt,le}
202
203           This statement contains one of the valid forms and a three digit
204           version number (e.g., require-drbd-module-version-eq 9.0.16;). If
205           the currently loaded DRBD kernel module does not match the
206           specification, parsing is aborted. The comparison operator names have
207           the same semantics as in test(1).
208
209       handlers
210
211           Define handlers to be invoked when certain events occur. The kernel
212           passes the resource name in the first command-line argument and
213           sets the following environment variables depending on the event's
214           context:
215
216           •   For events related to a particular device: the device's minor
217               number in DRBD_MINOR, the device's volume number in
218               DRBD_VOLUME.
219
220           •   For events related to a particular device on a particular peer:
221               the connection endpoints in DRBD_MY_ADDRESS, DRBD_MY_AF,
222               DRBD_PEER_ADDRESS, and DRBD_PEER_AF; the device's local minor
223               number in DRBD_MINOR, and the device's volume number in
224               DRBD_VOLUME.
225
226           •   For events related to a particular connection: the connection
227               endpoints in DRBD_MY_ADDRESS, DRBD_MY_AF, DRBD_PEER_ADDRESS,
228               and DRBD_PEER_AF; and, for each device defined for that
229               connection: the device's minor number in
230               DRBD_MINOR_volume-number.
231
232           •   For events that identify a device, if a lower-level device is
233               attached, the lower-level device's device name is passed in
234               DRBD_BACKING_DEV (or DRBD_BACKING_DEV_volume-number).
235
236           All parameters in this section are optional. Only a single handler
237           can be defined for each event; if no handler is defined, nothing
238           will happen.
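
           As an illustration, a handlers section that merely notifies an
           administrator might look like this (the scripts shown are shipped
           with drbd-utils on many distributions; adjust the paths and the
           recipient to your installation):

               resource r0 {
                    handlers {
                         split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                         out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                    }
                    # (on, volume, and connection sections omitted)
               }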
239
240       net
241
242           Define parameters for a connection. All parameters in this section
243           are optional.
244
245       on host-name [...]
246
247           Define the properties of a resource on a particular host or set of
248           hosts. Specifying more than one host name can make sense in a setup
249           with IP address failover, for example. The host-name argument must
250           match the Linux host name (uname -n).
251
252           Usually contains or inherits at least one volume section. The
253           node-id and address parameters must be defined in this section. The
254           device, disk, and meta-disk parameters must be defined in, or
255           inherited by, this section.
256
257           A normal configuration file contains two or more on sections for
258           each resource. Also see the floating section.
259
260       options
261
262           Define parameters for a resource. All parameters in this section
263           are optional.
264
265       resource name
266
267           Define a resource. Usually contains at least two on sections and at
268           least one connection section.
269
270       stacked-on-top-of resource
271
272           Used instead of an on section for configuring a stacked resource
273           with three to four nodes.
274
275           Starting with DRBD 9, stacking is deprecated. It is advised to use
276           resources which are replicated among more than two nodes instead.
277
278       startup
279
280           The parameters in this section determine the behavior of a resource
281           at startup time.
282
283       volume volume-number
284
285           Define a volume within a resource. The volume numbers in the
286           various volume sections of a resource define which devices on which
287           hosts form a replicated device.
288
289   Section connection Parameters
290       host name [address [address-family] address] [port port-number]
291
292           Defines an endpoint for a connection. Each host statement refers to
293           an on section in a resource. If a port number is defined, this
294           endpoint will use the specified port instead of the port defined in
295           the on section. Each connection section must contain exactly two
296           host parameters. Instead of two host parameters the connection may
297           contain multiple path sections.
298
299   Section path Parameters
300       host name [address [address-family] address] [port port-number]
301
302           Defines an endpoint for a connection. Each host statement refers to
303           an on section in a resource. If a port number is defined, this
304           endpoint will use the specified port instead of the port defined in
305           the on section. Each path section must contain exactly two host
306           parameters.
307
308   Section connection-mesh Parameters
309       hosts name...
310
311           Defines all nodes of a mesh. Each name refers to an on section in a
312           resource. The port that is defined in the on section will be used.
313
314   Section disk Parameters
315       al-extents extents
316
317           DRBD automatically maintains a "hot" or "active" disk area likely
318           to be written to again soon based on the recent write activity. The
319           "active" disk area can be written to immediately, while "inactive"
320           disk areas must be "activated" first, which requires a meta-data
321           write. We also refer to this active disk area as the "activity
322           log".
323
324           The activity log saves meta-data writes, but the whole log must be
325           resynced upon recovery of a failed node. The size of the activity
326           log is a major factor of how long a resync will take and how fast a
327           replicated disk will become consistent after a crash.
328
329           The activity log consists of a number of 4-Megabyte segments; the
330           al-extents parameter determines how many of those segments can be
331           active at the same time. The default value for al-extents is 1237,
332           with a minimum of 7 and a maximum of 65536.
333
334           Note that the effective maximum may be smaller, depending on how
335           you created the device meta data; see also drbdmeta(8). The
336           effective maximum is 919 * (available on-disk activity-log
337           ring-buffer area / 4 kB - 1); the default 32 kB ring buffer yields a
338           maximum of 6433 (which covers more than 25 GiB of data). We
339           recommend keeping this well within the amount your backend storage
340           and replication link are able to resync inside of about 5 minutes.
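
           For example, to enlarge the activity log (the value below is
           purely illustrative and must stay within the limits described
           above):

               resource r0 {
                    disk {
                         al-extents 2048;  # about 8 GiB of "active" area
                    }
               }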
341
342       al-updates {yes | no}
343
344           With this parameter, the activity log can be turned off entirely
345           (see the al-extents parameter). This will speed up writes because
346           fewer meta-data writes will be necessary, but the entire device
347           needs to be resynchronized upon recovery of a failed primary node.
348           The default value for al-updates is yes.
349
350       disk-barrier,
351       disk-flushes,
352       disk-drain
353           DRBD has three methods of handling the ordering of dependent write
354           requests:
355
356           disk-barrier
357               Use disk barriers to make sure that requests are written to
358               disk in the right order. Barriers ensure that all requests
359               submitted before a barrier make it to the disk before any
360               requests submitted after the barrier. This is implemented using
361               'tagged command queuing' on SCSI devices and 'native command
362               queuing' on SATA devices. Only some devices and device stacks
363               support this method. The device mapper (LVM) only supports
364               barriers in some configurations.
365
366               Note that on systems which do not support disk barriers,
367               enabling this option can lead to data loss or corruption. Until
368               DRBD 8.4.1, disk-barrier was turned on if the I/O stack below
369               DRBD supported barriers. Kernels since linux-2.6.36 (or
370               2.6.32 RHEL6) no longer allow detecting whether barriers are
371               supported. Since drbd-8.4.2, this option is off by default and
372               needs to be enabled explicitly.
373
374           disk-flushes
375               Use disk flushes between dependent write requests, also
376               referred to as 'force unit access' by drive vendors. This
377               forces all data to disk. This option is enabled by default.
378
379           disk-drain
380               Wait for the request queue to "drain" (that is, wait for the
381               requests to finish) before submitting a dependent write
382               request. This method requires that requests are stable on disk
383               when they finish. Before DRBD 8.0.9, this was the only method
384               implemented. This option is enabled by default. Do not disable
385               in production environments.
386
387           Of these three methods, DRBD will use the first that is enabled
388           and supported by the backing storage device. If all three of these
389           options are turned off, DRBD will submit write requests without
390           bothering about dependencies. Depending on the I/O stack, write
391           requests can be reordered, and they can be submitted in a different
392           order on different cluster nodes. This can result in data loss or
393           corruption. Therefore, turning off all three methods of controlling
394           write ordering is strongly discouraged.
395
396           A general guideline for configuring write ordering is to use disk
397           barriers or disk flushes when using ordinary disks (or an ordinary
398           disk array) with a volatile write cache. On storage without cache
399           or with a battery backed write cache, disk draining can be a
400           reasonable choice.
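
           For example, on storage with a battery-backed write cache,
           barriers and flushes are sometimes disabled along these lines
           (only do this if the cache is truly non-volatile):

               resource r0 {
                    disk {
                         disk-barrier no;
                         disk-flushes no;
                    }
               }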
401
402       disk-timeout
403           If the lower-level device on which a DRBD device stores its data
404           does not finish an I/O request within the defined disk-timeout,
405           DRBD treats this as a failure. The lower-level device is detached,
406           and the device's disk state advances to Diskless. If DRBD is
407           connected to one or more peers, the failed request is passed on to
408           one of them.
409
410           This option is dangerous and may lead to kernel panic!
411
412           "Aborting" requests, or force-detaching the disk, is intended for
413           completely blocked or hung local backing devices which no longer
414           complete requests at all, not even with error completions. In this
415           situation, usually a hard-reset and failover is the only way out.
416
417           By "aborting", basically faking a local error-completion, we allow
418           for a more graceful swichover by cleanly migrating services. Still
419           the affected node has to be rebooted "soon".
420
421           By completing these requests, we allow the upper layers to re-use
422           the associated data pages.
423
424           If later the local backing device "recovers", and now DMAs some
425           data from disk into the original request pages, in the best case it
426           will just put random data into unused pages; but typically it will
427           corrupt data that is by then completely unrelated, causing all sorts of
428           damage.
429
430           This means that delayed successful completion, especially of READ
431           requests, is a reason to panic(). We assume that a delayed *error*
432           completion is OK, though we still will complain noisily about it.
433
434           The default value of disk-timeout is 0, which stands for an
435           infinite timeout. Timeouts are specified in units of 0.1 seconds.
436           This option is available since DRBD 8.3.12.
437
438       md-flushes
439           Enable disk flushes and disk barriers on the meta-data device. This
440           option is enabled by default. See the disk-flushes parameter.
441
442       on-io-error handler
443
444           Configure how DRBD reacts to I/O errors on a lower-level device.
445           The following policies are defined:
446
447           pass_on
448               Change the disk status to Inconsistent, mark the failed block
449               as inconsistent in the bitmap, and retry the I/O operation on a
450               remote cluster node.
451
452           call-local-io-error
453               Call the local-io-error handler (see the handlers section).
454
455           detach
456               Detach the lower-level device and continue in diskless mode.
457
458
459       read-balancing policy
460           Distribute read requests among cluster nodes as defined by policy.
461           The supported policies are prefer-local (the default),
462           prefer-remote, round-robin, least-pending, when-congested-remote,
463           32K-striping, 64K-striping, 128K-striping, 256K-striping,
464           512K-striping and 1M-striping.
465
466           This option is available since DRBD 8.4.1.
467
468       resync-after res-name/volume
469
470           Define that a device should only resynchronize after the specified
471           other device. By default, no order between devices is defined, and
472           all devices will resynchronize in parallel. Depending on the
473           configuration of the lower-level devices, and the available network
474           and disk bandwidth, this can slow down the overall resync process.
475           This option can be used to form a chain or tree of dependencies
476           among devices.
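
           For example, to let the device in resource r2 resynchronize only
           after volume 0 of resource r1 (the resource names are
           placeholders):

               resource r2 {
                    disk {
                         resync-after r1/0;
                    }
               }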
477
478       rs-discard-granularity byte
479           When rs-discard-granularity is set to a non-zero, positive value
480           then DRBD tries to do a resync operation in requests of this size.
481           In case such a block contains only zero bytes on the sync source
482           node, the sync target node will issue a discard/trim/unmap command
483           for the area.
484
485           The value is constrained by the discard granularity of the backing
486           block device. If rs-discard-granularity is not a multiple of
487           the discard granularity of the backing block device DRBD rounds it
488           up. The feature only becomes active if the backing block device reads
489           back zeroes after a discard command.
490
491           The default value of rs-discard-granularity is 0. This option is
492           available since 8.4.7.
493
494       discard-zeroes-if-aligned {yes | no}
495
496           There are several aspects to discard/trim/unmap support on linux
497           block devices. Even if discard is supported in general, it may fail
498           silently, or may partially ignore discard requests. Devices also
499           announce whether reading from unmapped blocks returns defined data
500           (usually zeroes), or undefined data (possibly old data, possibly
501           garbage).
502
503           If on different nodes, DRBD is backed by devices with differing
504           discard characteristics, discards may lead to data divergence (old
505           data or garbage left over on one backend, zeroes due to unmapped
506           areas on the other backend). Online verify would now potentially
507           report tons of spurious differences. While probably harmless for
508           most use cases (fstrim on a file system), DRBD cannot have that.
509
510           To play safe, we have to disable discard support, if our local
511           backend (on a Primary) does not support "discard_zeroes_data=true".
512           We also have to translate discards to explicit zero-out on the
513           receiving side, unless the receiving side (Secondary) supports
514           "discard_zeroes_data=true", thereby allocating areas what were
515           supposed to be unmapped.
516
517           There are some devices (notably the LVM/DM thin provisioning) that
518           are capable of discard, but announce discard_zeroes_data=false. In
519           the case of DM-thin, discards aligned to the chunk size will be
520           unmapped, and reading from unmapped sectors will return zeroes.
521           However, unaligned partial head or tail areas of discard requests
522           will be silently ignored.
523
524           If we now add a helper to explicitly zero-out these unaligned
525           partial areas, while passing on the discard of the aligned full
526           chunks, we effectively achieve discard_zeroes_data=true on such
527           devices.
528
529           Setting discard-zeroes-if-aligned to yes will allow DRBD to use
530           discards, and to announce discard_zeroes_data=true, even on
531           backends that announce discard_zeroes_data=false.
532
533           Setting discard-zeroes-if-aligned to no will cause DRBD to always
534           fall-back to zero-out on the receiving side, and to not even
535           announce discard capabilities on the Primary, if the respective
536           backend announces discard_zeroes_data=false.
537
538           We used to ignore the discard_zeroes_data setting completely. To
539           not break established and expected behaviour, and suddenly cause
540           fstrim on thin-provisioned LVs to run out-of-space instead of
541           freeing up space, the default value is yes.
542
543           This option is available since 8.4.7.
544
545       disable-write-same {yes | no}
546
547           Some disks announce WRITE_SAME support to the kernel but fail with
548           an I/O error upon actually receiving such a request. This mostly
549           happens when using virtualized disks -- notably, this behavior has
550           been observed with VMware's virtual disks.
551
552           When disable-write-same is set to yes, WRITE_SAME detection is
553           manually overridden and support is disabled.
554
555           The default value of disable-write-same is no. This option is
556           available since 8.4.7.
557
558   Section peer-device-options Parameters
559       Please note that you open the section with the disk keyword.
560
561       c-delay-target delay_target,
562       c-fill-target fill_target,
563       c-max-rate max_rate,
564       c-plan-ahead plan_time
565           Dynamically control the resync speed. The following modes are
566           available:
567
568           •   Dynamic control with fill target (default). Enabled when
569               c-plan-ahead is non-zero and c-fill-target is non-zero. The
570               goal is to fill the buffers along the data path with a defined
571               amount of data. This mode is recommended when DRBD-proxy is
572               used. Configured with c-plan-ahead, c-fill-target and
573               c-max-rate.
574
575           •   Dynamic control with delay target. Enabled when c-plan-ahead is
576               non-zero (default) and c-fill-target is zero. The goal is to
577               have a defined delay along the path. Configured with
578               c-plan-ahead, c-delay-target and c-max-rate.
579
580           •   Fixed resync rate. Enabled when c-plan-ahead is zero. DRBD will
581               try to perform resync I/O at a fixed rate. Configured with
582               resync-rate.
583
584           The c-plan-ahead parameter defines how fast DRBD adapts to changes
585           in the resync speed. It should be set to five times the network
586           round-trip time or more. The default value of c-plan-ahead is 20,
587           in units of 0.1 seconds.
588
589           The c-fill-target parameter defines how much resync data DRBD
590           should aim to have in-flight at all times. Common values for
591           "normal" data paths range from 4K to 100K. The default value of
592           c-fill-target is 100, in units of sectors.
593
594           The c-delay-target parameter defines the delay in the resync path
595           that DRBD should aim for. This should be set to five times the
596           network round-trip time or more. The default value of
597           c-delay-target is 10, in units of 0.1 seconds.
598
599           The c-max-rate parameter limits the maximum bandwidth used by
600           dynamically controlled resyncs. Setting this to zero removes the
601           limitation (since DRBD 9.0.28). It should be set to either the
602           bandwidth available between the DRBD hosts and the machines hosting
603           DRBD-proxy, or to the available disk bandwidth. The default value
604           of c-max-rate is 102400, in units of KiB/s.
605
606           Dynamic resync speed control is available since DRBD 8.3.9.
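
           A sketch of a fill-target based configuration, for example for a
           DRBD-proxy setup (all values are illustrative; see the parameter
           descriptions above for units and defaults):

               resource r0 {
                    disk {
                         c-plan-ahead  20;
                         c-fill-target 10M;
                         c-max-rate    100M;
                    }
               }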
607
608       c-min-rate min_rate
609           A node which is primary and sync-source has to schedule application
610           I/O requests and resync I/O requests. The c-min-rate parameter
611           limits how much bandwidth is available for resync I/O; the
612           remaining bandwidth is used for application I/O.
613
614           A c-min-rate value of 0 means that there is no limit on the resync
615           I/O bandwidth. This can slow down application I/O significantly.
616           Use a value of 1 (1 KiB/s) for the lowest possible resync rate.
617
618           The default value of c-min-rate is 250, in units of KiB/s.
619
620       resync-rate rate
621
622           Define how much bandwidth DRBD may use for resynchronizing. DRBD
623           allows "normal" application I/O even during a resync. If the resync
624           takes up too much bandwidth, application I/O can become very slow.
625           This parameter allows you to avoid that. Please note that this option
626           only works when the dynamic resync controller is disabled.
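
           For example, to disable the dynamic controller and resynchronize
           at a fixed rate (the value is illustrative):

               resource r0 {
                    disk {
                         c-plan-ahead 0;
                         resync-rate  33M;
                    }
               }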
627
628   Section global Parameters
629       dialog-refresh time
630
631           The DRBD init script can be used to configure and start DRBD
632           devices, which can involve waiting for other cluster nodes. While
633           waiting, the init script shows the remaining waiting time. The
634           dialog-refresh defines the number of seconds between updates of
635           that countdown. The default value is 1; a value of 0 turns off the
636           countdown.
637
638       disable-ip-verification
639           Normally, DRBD verifies that the IP addresses in the configuration
640           match the host names. Use the disable-ip-verification parameter to
641           disable these checks.
642
643       usage-count {yes | no | ask}
644           As explained on DRBD's Online Usage Counter[2] web page, DRBD
645           includes a mechanism for anonymously counting how many
646           installations are using which versions of DRBD. The results are
647           available on the web page for anyone to see.
648
649           This parameter defines if a cluster node participates in the usage
650           counter; the supported values are yes, no, and ask (ask the user,
651           the default).
652
653           We would like to ask users to participate in the online usage
654           counter, as this provides us with valuable feedback for steering the
655           development of DRBD.
656
657       udev-always-use-vnr
658           When udev asks drbdadm for a list of device-related symlinks,
659           drbdadm suggests symlinks with differing naming conventions,
660           depending on whether the resource has explicit volume VNR { }
661           definitions, or only one single volume with the implicit volume
662           number 0:
663
664               # implicit single volume without "volume 0 {}" block
665               DEVICE=drbd<minor>
666               SYMLINK_BY_RES=drbd/by-res/<resource-name>
667               SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name>
668
669               # explicit volume definition: volume VNR { }
670               DEVICE=drbd<minor>
671               SYMLINK_BY_RES=drbd/by-res/<resource-name>/VNR
672               SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name>
673
674           If you define this parameter in the global section, drbdadm will
675           always add the .../VNR part, regardless of whether the
676           volume definition was implicit or explicit.
677
678           For legacy backward compatibility, this is off by default, but we
679           recommend enabling it.
680
681   Section handlers Parameters
682       after-resync-target cmd
683
684           Called on a resync target when a node state changes from
685           Inconsistent to Consistent as a resync finishes. This handler can
686           be used for removing the snapshot created in the
687           before-resync-target handler.
688
689       before-resync-target cmd
690
691           Called on a resync target before a resync begins. This handler can
692           be used for creating a snapshot of the lower-level device for the
693           duration of the resync: if the resync source becomes unavailable
694           during a resync, reverting to the snapshot can restore a consistent
695           state.
696
697       before-resync-source cmd
698
699           Called on a resync source before a resync begins.
700
701       out-of-sync cmd
702
703           Called on all nodes after a verify finishes and out-of-sync blocks
704           were found. This handler is mainly used for monitoring purposes. An
705           example would be to call a script that sends an alert SMS.
706
707       quorum-lost cmd
708
709           Called on a Primary that lost quorum. This handler is usually used
710           to reboot the node if it is not possible to restart the application
711           that uses the storage on top of DRBD.
712
713       fence-peer cmd
714
715           Called when a node should fence a resource on a particular peer.
716           The handler should not use the same communication path that DRBD
717           uses for talking to the peer.
718
719       unfence-peer cmd
720
721           Called when a node should remove fencing constraints from other
722           nodes.
723
724       initial-split-brain cmd
725
726           Called when DRBD connects to a peer and detects that the peer is in
727           a split-brain state with the local node. This handler is also
728           called for split-brain scenarios which will be resolved
729           automatically.
730
731       local-io-error cmd
732
733           Called when an I/O error occurs on a lower-level device.
734
735       pri-lost cmd
736
737           The local node is currently primary, but DRBD believes that it
738           should become a sync target. The node should give up its primary
739           role.
740
741       pri-lost-after-sb cmd
742
743           The local node is currently primary, but it has lost the
744           after-split-brain auto recovery procedure. The node should be
745           abandoned.
746
747       pri-on-incon-degr cmd
748
749           The local node is primary, and neither the local lower-level device
750           nor a lower-level device on a peer is up to date. (The primary has
751           no device to read from or to write to.)
752
753       split-brain cmd
754
755           DRBD has detected a split-brain situation which could not be
756           resolved automatically. Manual recovery is necessary. This handler
757           can be used to call for administrator attention.
758
759       disconnected cmd
760
761           A connection to a peer went down. The handler can learn about the
762           reason for the disconnect from the DRBD_CSTATE environment
763           variable.
764
765   Section net Parameters
766       after-sb-0pri policy
767           Define how to react if a split-brain scenario is detected and none
768           of the two nodes is in primary role. (We detect split-brain
769           scenarios when two nodes connect; split-brain decisions are always
770           between two nodes.) The defined policies are:
771
772           disconnect
773               No automatic resynchronization; simply disconnect.
774
775           discard-younger-primary,
776           discard-older-primary
777               Resynchronize from the node which became primary first
778               (discard-younger-primary) or last (discard-older-primary). If
779               both nodes became primary independently, the
780               discard-least-changes policy is used.
781
782           discard-zero-changes
783               If only one of the nodes wrote data since the split brain
784               situation was detected, resynchronize from this node to the
785               other. If both nodes wrote data, disconnect.
786
787           discard-least-changes
788               Resynchronize from the node with more modified blocks.
789
790           discard-node-nodename
791               Always resynchronize to the named node.
792
793       after-sb-1pri policy
794           Define how to react if a split-brain scenario is detected, with one
795           node in primary role and one node in secondary role. (We detect
796           split-brain scenarios when two nodes connect, so split-brain
797           decisions are always among two nodes.) The defined policies are:
798
799           disconnect
800               No automatic resynchronization, simply disconnect.
801
802           consensus
803               Discard the data on the secondary node if the after-sb-0pri
804               algorithm would also discard the data on the secondary node.
805               Otherwise, disconnect.
806
807           violently-as0p
808               Always take the decision of the after-sb-0pri algorithm, even
809               if it causes an erratic change of the primary's view of the
810               data. This is only useful if a single-node file system (i.e.,
811               not OCFS2 or GFS) with the allow-two-primaries flag is used.
812               This option can cause the primary node to crash, and should not
813               be used.
814
815           discard-secondary
816               Discard the data on the secondary node.
817
818           call-pri-lost-after-sb
819               Always take the decision of the after-sb-0pri algorithm. If the
820               decision is to discard the data on the primary node, call the
821               pri-lost-after-sb handler on the primary node.
822
823       after-sb-2pri policy
824           Define how to react if a split-brain scenario is detected and both
825           nodes are in primary role. (We detect split-brain scenarios when
826           two nodes connect, so split-brain decisions are always among two
827           nodes.) The defined policies are:
828
829           disconnect
830               No automatic resynchronization, simply disconnect.
831
832           violently-as0p
833               See the violently-as0p policy for after-sb-1pri.
834
835           call-pri-lost-after-sb
836               Call the pri-lost-after-sb helper program on one of the
837               machines unless that machine can demote to secondary. The
838               helper program is expected to reboot the machine, which brings
839               the node into a secondary role. Which machine runs the helper
840               program is determined by the after-sb-0pri strategy.
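
           A commonly used combination of automatic split-brain recovery
           policies looks like this (review the trade-offs above before
           adopting it):

               resource r0 {
                    net {
                         after-sb-0pri discard-zero-changes;
                         after-sb-1pri discard-secondary;
                         after-sb-2pri disconnect;
                    }
               }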
841
842       allow-two-primaries
843
844           The most common way to configure DRBD devices is to allow only one
845           node to be primary (and thus writable) at a time.
846
847           In some scenarios it is preferable to allow two nodes to be primary
848           at once; a mechanism outside of DRBD then must make sure that
849           writes to the shared, replicated device happen in a coordinated
850           way. This can be done with a shared-storage cluster file system
851           like OCFS2 and GFS, or with virtual machine images and a virtual
852           machine manager that can migrate virtual machines between physical
853           machines.
854
855           The allow-two-primaries parameter tells DRBD to allow two nodes to
856           be primary at the same time. Never enable this option when using a
857           non-distributed file system; otherwise, data corruption and node
858           crashes will result!
859
860       always-asbp
861           Normally the automatic after-split-brain policies are only used if
862           current states of the UUIDs do not indicate the presence of a third
863           node.
864
865           With this option you request that the automatic after-split-brain
866           policies are used as long as the data sets of the nodes are somehow
867           related. This might cause a full sync, if the UUIDs indicate the
868           presence of a third node. (Or double faults led to strange UUID
869           sets.)
870
871       connect-int time
872
873           As soon as a connection between two nodes is configured with
874           drbdsetup connect, DRBD immediately tries to establish the
875           connection. If this fails, DRBD waits for connect-int seconds and
876           then repeats. The default value of connect-int is 10 seconds.
877
878       cram-hmac-alg hash-algorithm
879
880           Configure the hash-based message authentication code (HMAC) or
881           secure hash algorithm to use for peer authentication. The kernel
882           supports a number of different algorithms, some of which may be
883           loadable as kernel modules. See the shash algorithms listed in
884           /proc/crypto. By default, cram-hmac-alg is unset. Peer
885           authentication also requires a shared-secret to be configured.
886
887       csums-alg hash-algorithm
888
889           Normally, when two nodes resynchronize, the sync target requests a
890           piece of out-of-sync data from the sync source, and the sync source
891           sends the data. With many usage patterns, a significant number of
892           those blocks will actually be identical.
893
894           When a csums-alg algorithm is specified, when requesting a piece of
895           out-of-sync data, the sync target also sends along a hash of the
896           data it currently has. The sync source compares this hash with its
897           own version of the data. It sends the sync target the new data if
898           the hashes differ, and tells it that the data are the same
899           otherwise. This reduces the network bandwidth required, at the cost
900           of higher cpu utilization and possibly increased I/O on the sync
901           target.
902
903           The csums-alg can be set to one of the secure hash algorithms
904           supported by the kernel; see the shash algorithms listed in
905           /proc/crypto. By default, csums-alg is unset.
906
907       csums-after-crash-only
908
909           Enabling this option (and csums-alg, above) makes it possible to
910           use the checksum based resync only for the first resync after
911           primary crash, but not for later "network hiccups".
912
913           In most cases, blocks that are marked as need-to-be-resynced are in
914           fact changed, so calculating checksums, and both reading and
915           writing the blocks on the resync target is all effective overhead.
916
917           The advantage of checksum based resync is mostly after primary
918           crash recovery, where the recovery marked larger areas (those
919           covered by the activity log) as need-to-be-resynced, just in case.
920           Introduced in 8.4.5.
921
922       data-integrity-alg  alg
923           DRBD normally relies on the data integrity checks built into the
924           TCP/IP protocol, but if a data integrity algorithm is configured,
925           it will additionally use this algorithm to make sure that the data
926           received over the network match what the sender has sent. If a data
927           integrity error is detected, DRBD will close the network connection
928           and reconnect, which will trigger a resync.
929
930           The data-integrity-alg can be set to one of the secure hash
931           algorithms supported by the kernel; see the shash algorithms listed
932           in /proc/crypto. By default, this mechanism is turned off.
933
934           Because of the CPU overhead involved, we recommend not to use this
935           option in production environments. Also see the notes on data
936           integrity below.
937
938       fencing fencing_policy
939
940           Fencing is a preventive measure to avoid situations where both
941           nodes are primary and disconnected. This is also known as a
942           split-brain situation. DRBD supports the following fencing
943           policies:
944
945           dont-care
946               No fencing actions are taken. This is the default policy.
947
948           resource-only
949               If a node becomes a disconnected primary, it tries to fence the
950               peer. This is done by calling the fence-peer handler. The
951               handler is supposed to reach the peer over an alternative
952               communication path and call 'drbdadm outdate minor' there.
953
954           resource-and-stonith
955               If a node becomes a disconnected primary, it freezes all its IO
956               operations and calls its fence-peer handler. The fence-peer
957               handler is supposed to reach the peer over an alternative
958               communication path and call 'drbdadm outdate minor' there. In
959               case it cannot do that, it should stonith the peer. IO is
960               resumed as soon as the situation is resolved. In case the
961               fence-peer handler fails, I/O can be resumed manually with
962               'drbdadm resume-io'.
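
           A fencing setup for a Pacemaker-managed cluster might be sketched
           like this (the handler scripts are shipped with drbd-utils; adjust
           the paths to your installation):

               resource r0 {
                    net {
                         fencing resource-and-stonith;
                    }
                    handlers {
                         fence-peer   "/usr/lib/drbd/crm-fence-peer.9.sh";
                         unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
                    }
               }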
963
964       ko-count number
965
966           If a secondary node fails to complete a write request in ko-count
967           times the timeout parameter, it is excluded from the cluster. The
968           primary node then sets the connection to this secondary node to
969           Standalone. To disable this feature, you should explicitly set it
970           to 0; defaults may change between versions.
971
972       max-buffers number
973
974           Limits the memory usage per DRBD minor device on the receiving
975           side, or for internal buffers during resync or online-verify. Unit
976           is PAGE_SIZE, which is 4 KiB on most systems. The minimum possible
977           setting is hard coded to 32 (=128 KiB). These buffers are used to
978           hold data blocks while they are written to/read from disk. To avoid
979           possible distributed deadlocks on congestion, this setting is used
980           as a throttle threshold rather than a hard limit. Once more than
981           max-buffers pages are in use, further allocation from this pool is
982           throttled. You want to increase max-buffers if you cannot saturate
983           the IO backend on the receiving side.
984
985       max-epoch-size number
986
987           Define the maximum number of write requests DRBD may issue before
988           issuing a write barrier. The default value is 2048, with a minimum
989           of 1 and a maximum of 20000. Setting this parameter to a value
990           below 10 is likely to decrease performance.
991
992       on-congestion policy,
993       congestion-fill threshold,
994       congestion-extents threshold
995           By default, DRBD blocks when the TCP send queue is full. This
996           prevents applications from generating further write requests until
997           more buffer space becomes available again.
998
999           When DRBD is used together with DRBD-proxy, it can be better to use
1000           the pull-ahead on-congestion policy, which can switch DRBD into
1001           ahead/behind mode before the send queue is full. DRBD then records
1002           the differences between itself and the peer in its bitmap, but it
1003           no longer replicates them to the peer. When enough buffer space
1004           becomes available again, the node resynchronizes with the peer and
1005           switches back to normal replication.
1006
1007           This has the advantage of not blocking application I/O even when
1008           the queues fill up, and the disadvantage that peer nodes can fall
1009           behind much further. Also, while resynchronizing, peer nodes will
1010           become inconsistent.
1011
1012           The available congestion policies are block (the default) and
1013           pull-ahead. The congestion-fill parameter defines how much data is
1014           allowed to be "in flight" in this connection. The default value is
1015           0, which disables this mechanism of congestion control, with a
1016           maximum of 10 GiBytes. The congestion-extents parameter defines how
1017           many bitmap extents may be active before switching into
1018           ahead/behind mode, with the same default and limits as the
1019           al-extents parameter. The congestion-extents parameter is effective
1020           only when set to a value smaller than al-extents.
1021
1022           Ahead/behind mode is available since DRBD 8.3.10.
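
           For example, a pull-ahead configuration for a DRBD-proxy setup
           might look like this (the thresholds are illustrative;
           congestion-extents only takes effect if it is smaller than
           al-extents):

               resource r0 {
                    net {
                         on-congestion      pull-ahead;
                         congestion-fill    2G;
                         congestion-extents 1000;
                    }
               }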
1023
1024       ping-int interval
1025
1026           When the TCP/IP connection to a peer is idle for more than ping-int
1027           seconds, DRBD will send a keep-alive packet to make sure that a
1028           failed peer or network connection is detected reasonably soon. The
1029           default value is 10 seconds, with a minimum of 1 and a maximum of
1030           120 seconds. The unit is seconds.
1031
1032       ping-timeout timeout
1033
1034           Define the timeout for replies to keep-alive packets. If the peer
1035           does not reply within ping-timeout, DRBD will close and try to
1036           reestablish the connection. The default value is 0.5 seconds, with
1037           a minimum of 0.1 seconds and a maximum of 3 seconds. The unit is
1038           tenths of a second.
1039
1040       socket-check-timeout timeout
1041           In setups involving a DRBD-proxy and connections that experience a
1042           lot of buffer-bloat, it might be necessary to set ping-timeout to an
1043           unusually high value. By default, DRBD uses the same value to wait
1044           until a newly established TCP connection is considered stable. Since
1045           the DRBD-proxy is usually located in the same data center, such a
1046           long wait time may hinder DRBD's connect process.
1047
1048           In such setups, socket-check-timeout should be set to at least the
1049           round-trip time between DRBD and DRBD-proxy; in most cases that is
1050           1.
1051
1052           The default unit is tenths of a second, the default value is 0
1053           (which causes DRBD to use the value of ping-timeout instead).
1054           Introduced in 8.4.5.
1055
1056       protocol name
1057           Use the specified protocol on this connection. The supported
1058           protocols are:
1059
1060           A
1061               Writes to the DRBD device complete as soon as they have reached
1062               the local disk and the TCP/IP send buffer.
1063
1064           B
1065               Writes to the DRBD device complete as soon as they have reached
1066               the local disk, and all peers have acknowledged the receipt of
1067               the write requests.
1068
1069           C
1070               Writes to the DRBD device complete as soon as they have reached
1071               the local and all remote disks.
1072
1073
1074       rcvbuf-size size
1075
1076           Configure the size of the TCP/IP receive buffer. A value of 0 (the
1077           default) causes the buffer size to adjust dynamically. This
1078           parameter usually does not need to be set, but it can be set to a
1079           value up to 10 MiB. The default unit is bytes.
1080
1081       rr-conflict policy
1082           This option helps to solve cases in which the outcome of the resync
1083           decision is incompatible with the current role assignment in the
1084           cluster. The defined policies are:
1085
1086           disconnect
1087               No automatic resynchronization, simply disconnect.
1088
1089           retry-connect
1090               Disconnect now, and try to reconnect immediately afterwards.
1091
1092           violently
1093               Resync to the primary node is allowed, violating the assumption
1094               that data on a block device are stable for one of the nodes.
1095               Do not use this option; it is dangerous.
1096
1097           call-pri-lost
1098               Call the pri-lost handler on one of the machines. The handler
1099               is expected to reboot the machine, which puts it into secondary
1100               role.
1101
1102       shared-secret secret
1103
1104           Configure the shared secret used for peer authentication. The
1105           secret is a string of up to 64 characters. Peer authentication also
1106           requires the cram-hmac-alg parameter to be set.
1107
1108       sndbuf-size size
1109
1110           Configure the size of the TCP/IP send buffer. Since DRBD 8.0.13 /
1111           8.2.7, a value of 0 (the default) causes the buffer size to adjust
1112           dynamically. Values below 32 KiB are harmful to the throughput on
1113           this connection. Large buffer sizes can be useful especially when
1114           protocol A is used over high-latency networks; the maximum value
1115           supported is 10 MiB.
1116
1117       tcp-cork
1118           By default, DRBD uses the TCP_CORK socket option to prevent the
1119           kernel from sending partial messages; this results in fewer and
1120           bigger packets on the network. Some network stacks can perform
1121           worse with this optimization. On these, the tcp-cork parameter can
1122           be used to turn this optimization off.
1123
1124       timeout time
1125
1126           Define the timeout for replies over the network: if a peer node
1127           does not send an expected reply within the specified timeout, it is
1128           considered dead and the TCP/IP connection is closed. The timeout
1129           value must be lower than connect-int and lower than ping-int. The
1130           default is 6 seconds; the value is specified in tenths of a second.
1131
1132       transport type
1133
1134           With DRBD 9, the network transport used by DRBD is loaded as a
1135           separate module. With this option you can specify which transport
1136           and module to load. At present, only two options exist, tcp and
1137           rdma. Please note that currently the RDMA transport module is only
1138           available with a license purchased from LINBIT. The default is tcp.
1139
1140       use-rle
1141
1142           Each replicated device on a cluster node has a separate bitmap for
1143           each of its peer devices. The bitmaps are used for tracking the
1144           differences between the local and peer device: depending on the
1145           cluster state, a disk range can be marked as different from the
1146           peer in the device's bitmap, in the peer device's bitmap, or in
1147           both bitmaps. When two cluster nodes connect, they exchange each
1148           other's bitmaps, and they each compute the union of the local and
1149           peer bitmap to determine the overall differences.
1150
1151           Bitmaps of very large devices are also relatively large, but they
1152           usually compress very well using run-length encoding. This can save
1153           time and bandwidth for the bitmap transfers.
1154
1155           The use-rle parameter determines if run-length encoding should be
1156           used. It is on by default since DRBD 8.4.0.
1157
1158       verify-alg hash-algorithm
1159           Online verification (drbdadm verify) computes and compares
1160           checksums of disk blocks (i.e., hash values) in order to detect if
1161           they differ. The verify-alg parameter determines which algorithm to
1162           use for these checksums. It must be set to one of the secure hash
1163           algorithms supported by the kernel before online verify can be
1164           used; see the shash algorithms listed in /proc/crypto.
1165
1166           We recommend scheduling online verification regularly during
1167           low-load periods, for example once a month. Also see the notes on
1168           data integrity below.
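
               A sketch, assuming the sha256 shash algorithm is available in
               /proc/crypto on all nodes:

                   net {
                        verify-alg sha256;   # must be listed as shash in /proc/crypto
                   }

               A verification run can then be started with drbdadm verify, for
               example from a monthly cron job during a low-load window.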
1169
1170       allow-remote-read bool-value
1171           Allows or disallows DRBD to read from a peer node.
1172
1173           When the disk of a primary node is detached, DRBD will try to
1174           continue reading and writing from another node in the cluster. For
1175           this purpose, it searches for nodes with up-to-date data, and uses
1176           any found node to resume operations. In some cases it may not be
1177           desirable to read back data from a peer node, because the node
1178           should only be used as a replication target. In this case, the
1179           allow-remote-read parameter can be set to no, which would prohibit
1180           this node from reading data from the peer node.
1181
1182           The allow-remote-read parameter is available since DRBD 9.0.19, and
1183           defaults to yes.
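
               A minimal sketch for a node that should act as a pure
               replication target; the setting itself is the only non-default
               shown here:

                   net {
                        allow-remote-read no;   # never read data back from peers
                   }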
1184
1185   Section on Parameters
1186       address [address-family] address:port
1187
1188           Defines the address family, address, and port of a connection
1189           endpoint.
1190
1191           The address families ipv4, ipv6, ssocks (Dolphin Interconnect
1192           Solutions' "super sockets"), sdp (Infiniband Sockets Direct
1193           Protocol), and sci are supported (sci is an alias for ssocks). If
1194           no address family is specified, ipv4 is assumed. For all address
1195           families except ipv6, the address is specified in IPv4 address
1196           notation (for example, 1.2.3.4). For ipv6, the address is enclosed
1197           in brackets and uses IPv6 address notation (for example,
1198           [fd01:2345:6789:abcd::1]). The port is always specified as a
1199           decimal number from 1 to 65535.
1200
1201           On each host, the port numbers must be unique for each address;
1202           ports cannot be shared.
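
               A sketch of an on section using the ipv6 address family; the
               host name, node-id, and address are placeholders:

                   on charlie {
                        node-id   2;
                        address   ipv6 [fd01:2345:6789:abcd::3]:7000;
                   }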
1203
1204       node-id value
1205
1206           Defines the unique node identifier for a node in the cluster. Node
1207           identifiers are used to identify individual nodes in the network
1208           protocol, and to assign bitmap slots to nodes in the metadata.
1209
1210           Node identifiers can only be reassigned in a cluster when the
1211           cluster is down. It is essential that the node identifiers in the
1212           configuration and in the device metadata are changed consistently
1213           on all hosts. To change the metadata, dump the current state with
1214           drbdmeta dump-md, adjust the bitmap slot assignment, and update the
1215           metadata with drbdmeta restore-md.
1216
1217           The node-id parameter exists since DRBD 9. Its value ranges from 0
1218           to 16; there is no default.
1219
1220   Section options Parameters (Resource Options)
1221       auto-promote bool-value
1222           A resource must be promoted to primary role before any of its
1223           devices can be mounted or opened for writing.
1224
1225           Before DRBD 9, this could only be done explicitly ("drbdadm
1226           primary"). Since DRBD 9, the auto-promote parameter allows a
1227           resource to be promoted to primary role automatically when one of
1228           its devices is mounted or opened for writing. As soon as all devices
1229           are unmounted or closed with no more remaining users, the role of
1230           the resource changes back to secondary.
1231
1232           Automatic promotion only succeeds if the cluster state allows it
1233           (that is, if an explicit drbdadm primary command would succeed).
1234           Otherwise, mounting or opening the device fails as it already did
1235           before DRBD 9: the mount(2) system call fails with errno set to
1236           EROFS (Read-only file system); the open(2) system call fails with
1237           errno set to EMEDIUMTYPE (wrong medium type).
1238
1239           Irrespective of the auto-promote parameter, if a device is promoted
1240           explicitly (drbdadm primary), it also needs to be demoted
1241           explicitly (drbdadm secondary).
1242
1243           The auto-promote parameter is available since DRBD 9.0.0, and
1244           defaults to yes.
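
               For illustration, auto-promotion could be disabled in the
               resource options section so that role changes always go through
               drbdadm primary and drbdadm secondary:

                   options {
                        auto-promote no;   # default is yes
                   }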
1245
1246       cpu-mask cpu-mask
1247
1248           Set the CPU affinity mask for DRBD kernel threads. The CPU mask is
1249           specified as a hexadecimal number. The default value is 0, which
1250           lets the scheduler decide which kernel threads run on which CPUs.
1251           CPU numbers in cpu-mask which do not exist in the system are
1252           ignored.
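
               A sketch with an arbitrary example mask: the value is
               hexadecimal, so 3 (0x3) pins the DRBD kernel threads to CPUs 0
               and 1:

                   options {
                        cpu-mask 3;   # hexadecimal; 0x3 = CPUs 0 and 1
                   }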
1253
1254       on-no-data-accessible policy
1255           Determine how to deal with I/O requests when the requested data is
1256           not available locally or remotely (for example, when all disks have
1257           failed). The defined policies are:
1258
1259           io-error
1260               System calls fail with errno set to EIO.
1261
1262           suspend-io
1263               The resource suspends I/O. I/O can be resumed by (re)attaching
1264               the lower-level device, by connecting to a peer which has
1265               access to the data, or by forcing DRBD to resume I/O with
1266               drbdadm resume-io res. When no data is available, forcing I/O
1267               to resume will result in the same behavior as the io-error
1268               policy.
1269
1270           This setting is available since DRBD 8.3.9; the default policy is
1271           io-error.
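
               A sketch that prefers suspending I/O over returning I/O errors
               when no data is accessible; whether that is appropriate depends
               on the workload:

                   options {
                        on-no-data-accessible suspend-io;   # default is io-error
                   }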
1272
1273       peer-ack-window value
1274
1275           On each node and for each device, DRBD maintains a bitmap of the
1276           differences between the local and remote data for each peer device.
1277           For example, in a three-node setup (nodes A, B, C) each with a
1278           single device, every node maintains one bitmap for each of its
1279           peers.
1280
1281           When nodes receive write requests, they know how to update the
1282           bitmaps for the writing node, but not how to update the bitmaps
1283           between themselves. In this example, when a write request
1284           propagates from node A to B and C, nodes B and C know that they
1285           have the same data as node A, but not whether or not they both have
1286           the same data.
1287
1288           As a remedy, the writing node occasionally sends peer-ack packets
1289           to its peers which tell them which state they are in relative to
1290           each other.
1291
1292           The peer-ack-window parameter specifies how much data a primary
1293           node may send before sending a peer-ack packet. A low value causes
1294           increased network traffic; a high value causes less network traffic
1295           but higher memory consumption on secondary nodes and higher resync
1296           times between the secondary nodes after primary node failures.
1297           (Note: peer-ack packets may be sent due to other reasons as well,
1298           e.g. membership changes or expiry of the peer-ack-delay timer.)
1299
1300           The default value for peer-ack-window is 2 MiB; the default unit
1301           is sectors. This option is available since 9.0.0.
1302
1303       peer-ack-delay expiry-time
1304
1305           If after the last finished write request no new write request gets
1306           issued for expiry-time, then a peer-ack packet is sent. If a new
1307           write request is issued before the timer expires, the timer gets
1308           reset to expiry-time. (Note: peer-ack packets may be sent due to
1309           other reasons as well, e.g. membership changes or the
1310           peer-ack-window option.)
1311
1312           This parameter may influence resync behavior on remote nodes. Peer
1313           nodes need to wait until they receive a peer-ack before releasing a
1314           lock on an AL-extent. Resync operations between peers may need to
1315           wait for these locks.
1316
1317           The default value for peer-ack-delay is 100 milliseconds; the
1318           default unit is milliseconds. This option is available since 9.0.0.
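
               A sketch with illustrative values only (these are not tuning
               recommendations); note that a bare peer-ack-window value would
               be interpreted in sectors, so a unit suffix is used here:

                   options {
                        peer-ack-window 4M;    # default 2 MiB
                        peer-ack-delay  100;   # milliseconds (default)
                   }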
1319
1320       quorum value
1321
1322           When activated, a cluster partition requires quorum in order to
1323           modify the replicated data set. That means a node in a cluster
1324           partition can only be promoted to primary if the partition has
1325           quorum. Every node with a disk that is directly connected to the
1326           node to be promoted counts towards quorum. If a primary node needs
1327           to execute a write request but its cluster partition has lost
1328           quorum, it will freeze I/O or reject the write request with an
1329           error (depending on the on-no-quorum setting). Upon losing quorum,
1330           a primary always invokes the quorum-lost handler. The handler is
1331           intended for notification purposes; its return code is ignored.
1332
1333           The option's value can be set to off, majority, all or a numeric
1334           value. If you set it to a numeric value, make sure that the value
1335           is greater than half of your number of nodes. Quorum is a
1336           mechanism to avoid data divergence; it can be used instead of
1337           fencing when there are more than two replicas. It defaults to off.
1338
1339           If all missing nodes are marked as outdated, a partition always has
1340           quorum, no matter how small it is. That is, if you disconnect all
1341           secondary nodes gracefully, a single primary continues to operate.
1342           The moment a single secondary is lost, it has to be assumed that it
1343           forms a partition with all the missing outdated nodes. If the
1344           remaining partition might be smaller than the other one, quorum is
1345           lost at that moment.
1346
1347           If you want to allow permanently diskless nodes to gain quorum, it
1348           is recommended not to use majority or all. Specify an absolute
1349           number instead, since DRBD's heuristic to determine the complete
1350           number of diskful nodes in the cluster is unreliable.
1351
1352           The quorum implementation is available starting with the DRBD
1353           kernel driver version 9.0.7.
1354
1355       quorum-minimum-redundancy value
1356
1357           This option sets the minimal required number of nodes with an
1358           UpToDate disk to allow the partition to gain quorum. This is a
1359           different requirement than the plain quorum option expresses.
1360
1361           The option's value might be set to off, majority, all or a numeric
1362           value. If you set it to a numeric value, make sure that the value
1363           is greater than half of your number of nodes.
1364
1365           If you want to allow permanently diskless nodes to gain quorum, it
1366           is recommended not to use majority or all. Specify an absolute
1367           number instead, since DRBD's heuristic to determine the complete
1368           number of diskful nodes in the cluster is unreliable.
1369
1370           This option is available starting with the DRBD kernel driver
1371           version 9.0.10.
1372
1373       on-no-quorum {io-error | suspend-io}
1374
1375           By default, DRBD freezes I/O on a device that has lost quorum.
1376           Setting on-no-quorum to io-error instead causes all I/O operations
1377           to be completed with an error if quorum is lost.
1378
1379           The on-no-quorum option is available starting with the DRBD kernel
1380           driver version 9.0.8.
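
               The following sketch combines the quorum-related options for a
               hypothetical three-node cluster; the values are illustrative,
               not defaults:

                   options {
                        quorum majority;               # 2 out of 3 nodes
                        quorum-minimum-redundancy 2;   # at least 2 UpToDate disks
                        on-no-quorum io-error;         # fail I/O instead of freezing
                   }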
1381
1382   Section startup Parameters
1383       The parameters in this section define the behavior of DRBD at system
1384       startup time, in the DRBD init script. They have no effect once the
1385       system is up and running.
1386
1387       degr-wfc-timeout timeout
1388
1389           Define how long to wait until all peers are connected in case the
1390           cluster consisted of a single node only when the system went down.
1391           This parameter is usually set to a value smaller than wfc-timeout.
1392           The assumption here is that peers which were unreachable before a
1393           reboot are less likely to be reachable after the reboot, so waiting
1394           is less likely to help.
1395
1396           The timeout is specified in seconds. The default value is 0, which
1397           stands for an infinite timeout. Also see the wfc-timeout parameter.
1398
1399       outdated-wfc-timeout timeout
1400
1401           Define how long to wait until all peers are connected if all peers
1402           were outdated when the system went down. This parameter is usually
1403           set to a value smaller than wfc-timeout. The assumption here is
1404           that an outdated peer cannot have become primary in the meantime,
1405           so we don't need to wait for it as long as for a node which was
1406           alive before.
1407
1408           The timeout is specified in seconds. The default value is 0, which
1409           stands for an infinite timeout. Also see the wfc-timeout parameter.
1410
1411       stacked-timeouts
1412           On stacked devices, the wfc-timeout and degr-wfc-timeout parameters
1413           in the configuration are usually ignored, and both timeouts are set
1414           to twice the connect-int timeout. The stacked-timeouts parameter
1415           tells DRBD to use the wfc-timeout and degr-wfc-timeout parameters
1416           as defined in the configuration, even on stacked devices. Only use
1417           this parameter if the peer of the stacked resource is usually not
1418           available, or will not become primary. Incorrect use of this
1419           parameter can lead to unexpected split-brain scenarios.
1420
1421       wait-after-sb
1422           This parameter causes DRBD to continue waiting in the init script
1423           even when a split-brain situation has been detected, and the nodes
1424           therefore refuse to connect to each other.
1425
1426       wfc-timeout timeout
1427
1428           Define how long the init script waits until all peers are
1429           connected. This can be useful in combination with a cluster manager
1430           which cannot manage DRBD resources: when the cluster manager
1431           starts, the DRBD resources will already be up and running. With a
1432           more capable cluster manager such as Pacemaker, it makes more sense
1433           to let the cluster manager control DRBD resources. The timeout is
1434           specified in seconds. The default value is 0, which stands for an
1435           infinite timeout. Also see the degr-wfc-timeout parameter.
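
           A sketch of a startup section with finite timeouts; the numbers are
           placeholders, since the defaults wait forever:

               startup {
                    wfc-timeout          120;   # seconds
                    degr-wfc-timeout      60;   # seconds
                    outdated-wfc-timeout  30;   # seconds
               }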
1436
1437   Section volume Parameters
1438       device /dev/drbdminor-number
1439
1440           Define the device name and minor number of a replicated block
1441           device. This is the device that applications are supposed to
1442           access; in most cases, the device is not used directly, but as a
1443           file system. This parameter is required and the standard device
1444           naming convention is assumed.
1445
1446           In addition to this device, udev will create
1447           /dev/drbd/by-res/resource/volume and
1448           /dev/drbd/by-disk/lower-level-device symlinks to the device.
1449
1450       disk {[disk] | none}
1451
1452           Define the lower-level block device that DRBD will use for storing
1453           the actual data. While the replicated drbd device is configured,
1454           the lower-level device must not be used directly. Even read-only
1455           access with tools like dumpe2fs(8) and similar is not allowed. The
1456           keyword none specifies that no lower-level block device is
1457           configured; this also overrides inheritance of the lower-level
1458           device.
1459
1460       meta-disk internal,
1461       meta-disk device,
1462       meta-disk device [index]
1463
1464           Define where the metadata of a replicated block device resides: it
1465           can be internal, meaning that the lower-level device contains both
1466           the data and the metadata, or on a separate device.
1467
1468           When the index form of this parameter is used, multiple replicated
1469           devices can share the same metadata device, each using a separate
1470           index. Each index occupies 128 MiB of data, which corresponds to a
1471           replicated device size of at most 4 TiB with two cluster nodes. We
1472           recommend not sharing metadata devices anymore, and instead using
1473           the LVM volume manager to create metadata devices as needed.
1474
1475           When the index form of this parameter is not used, the size of the
1476           lower-level device determines the size of the metadata. The size
1477           needed is 36 KiB + (size of lower-level device) / 32K * (number of
1478           nodes - 1). If the metadata device is bigger than that, the extra
1479           space is not used.
1480
1481           This parameter is required if a disk other than none is specified,
1482           and ignored if disk is set to none. A meta-disk parameter without a
1483           disk parameter is not allowed.
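
               As a rough worked example of the size formula above: for a
               1 TiB lower-level device in a three-node cluster, the metadata
               needs about 36 KiB + (1 TiB / 32K) * 2, which comes to roughly
               64 MiB. The following sketch (device names are placeholders)
               puts the metadata of one volume on a dedicated external device:

                   volume 1 {
                        device    /dev/drbd2;
                        disk      /dev/sdb1;
                        meta-disk /dev/sdc1;   # external, non-indexed metadata
                   }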
1484

NOTES ON DATA INTEGRITY

1486       DRBD supports two different mechanisms for data integrity checking:
1487       first, the data-integrity-alg network parameter allows adding a
1488       checksum to the data sent over the network. Second, the online
1489       verification mechanism (drbdadm verify and the verify-alg parameter)
1490       allows checking for differences in the on-disk data.
1491
1492       Both mechanisms can produce false positives if the data is modified
1493       during I/O (i.e., while it is being sent over the network or written to
1494       disk). This does not always indicate a problem: for example, some file
1495       systems and applications do modify data under I/O for certain
1496       operations. Swap space can also undergo changes while under I/O.
1497
1498       Network data integrity checking tries to identify data modification
1499       during I/O by verifying the checksums on the sender side after sending
1500       the data. If it detects a mismatch, it logs an error. The receiver also
1501       logs an error when it detects a mismatch. Thus, an error logged only on
1502       the receiver side indicates an error on the network, and an error
1503       logged on both sides indicates data modification under I/O.
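
           As an illustration (the algorithm choice is an assumption; any
           digest listed in /proc/crypto works), network data integrity
           checking is enabled with the data-integrity-alg net parameter:

               net {
                    data-integrity-alg crc32c;   # checksum on replication traffic
               }

           Note that computing these checksums adds CPU overhead on both
           nodes.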
1504
1505       The most recent example of systematic data corruption was identified as
1506       a bug in the TCP offloading engine and driver of a certain type of GBit
1507       NIC in 2007: the data corruption happened on the DMA transfer from core
1508       memory to the card. Because the TCP checksums were calculated on the
1509       card, the TCP/IP protocol checksums did not reveal this problem.
1510

VERSION

1512       This document was revised for version 9.0.0 of the DRBD distribution.
1513

AUTHOR

1515       Written by Philipp Reisner <philipp.reisner@linbit.com> and Lars
1516       Ellenberg <lars.ellenberg@linbit.com>.
1517

REPORTING BUGS

1519       Report bugs to <drbd-user@lists.linbit.com>.
1520

COPYRIGHT

1522       Copyright 2001-2018 LINBIT Information Technologies, Philipp Reisner,
1523       Lars Ellenberg. This is free software; see the source for copying
1524       conditions. There is NO warranty; not even for MERCHANTABILITY or
1525       FITNESS FOR A PARTICULAR PURPOSE.
1526

SEE ALSO

1528       drbd(8), drbdsetup(8), drbdadm(8), DRBD User's Guide[1], DRBD Web
1529       Site[3]
1530

NOTES

1532        1. DRBD User's Guide
1533           http://www.drbd.org/users-guide/
1534
1535        2. Online Usage Counter
1536           http://usage.drbd.org
1537
1539
1540        3. DRBD Web Site
1541           http://www.drbd.org/
1542
1543
1544
1545DRBD 9.0.x                      17 January 2018                   DRBD.CONF(5)