BTRFS-MAN5(5)                    Btrfs Manual                    BTRFS-MAN5(5)

NAME

6       btrfs-man5 - topics about the BTRFS filesystem (mount options,
7       supported file attributes and other)
8

DESCRIPTION

10       This document describes topics related to BTRFS that are not specific
11       to the tools. Currently covers:
12
13        1. mount options
14
15        2. filesystem features
16
17        3. checksum algorithms
18
19        4. filesystem limits
20
21        5. bootloader support
22
23        6. file attributes
24
25        7. control device
26

MOUNT OPTIONS

       This section describes mount options specific to BTRFS. For the generic
       mount options, please refer to the mount(8) manual page. The options are
       sorted alphabetically (ignoring the no prefix).

           Note
           Most mount options apply to the whole filesystem, and only the
           options given for the first mounted subvolume take effect. This is
           due to lack of implementation and may change in the future. For
           example, you can’t set per-subvolume nodatacow, nodatasum, or
           compress using mount options. This should eventually be fixed, but
           it has proved difficult to implement correctly within the Linux VFS
           framework.
40
41       acl, noacl
42           (default: on)
43
           Enable/disable support for POSIX Access Control Lists (ACLs). See
           the acl(5) manual page for more information about ACLs.
46
47           The support for ACL is build-time configurable (BTRFS_FS_POSIX_ACL)
48           and mount fails if acl is requested but the feature is not compiled
49           in.
50
51       autodefrag, noautodefrag
52           (since: 3.0, default: off)
53
           Enable automatic file defragmentation. When enabled, small random
           writes into files (in a range of tens of kilobytes, currently 64K)
           are detected and queued up for the defragmentation process. Not
           well suited for large database workloads.

           The read latency may increase because the adjacent blocks that make
           up the range for defragmentation have to be read; a subsequent
           write then merges the blocks in the new location.
62
63               Warning
64               Defragmenting with Linux kernel versions < 3.9 or ≥ 3.14-rc2 as
65               well as with Linux stable kernel versions ≥ 3.10.31, ≥ 3.12.12
66               or ≥ 3.13.4 will break up the reflinks of COW data (for example
67               files copied with cp --reflink, snapshots or de-duplicated
68               data). This may cause considerable increase of space usage
69               depending on the broken up reflinks.
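
           The version ranges in the warning above can be encoded as a small
           check. This is only a sketch: the function name is hypothetical,
           release candidates are not modelled (every 3.14 and later kernel
           is treated as affected), and only the stable series named in the
           warning are covered.

```python
def defrag_breaks_reflinks(major, minor, patch=0):
    """Return True if defragmenting on this kernel breaks up reflinks.

    Encodes the ranges from the warning: < 3.9, >= 3.14, and the stable
    releases >= 3.10.31, >= 3.12.12 and >= 3.13.4.
    """
    series = (major, minor)
    if series < (3, 9) or series >= (3, 14):
        return True
    # Stable series that became affected in a later point release.
    first_affected_patch = {(3, 10): 31, (3, 12): 12, (3, 13): 4}
    threshold = first_affected_patch.get(series)
    return threshold is not None and patch >= threshold
```

           For example, defrag_breaks_reflinks(3, 10, 30) is False while
           defrag_breaks_reflinks(3, 10, 31) is True.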
70
71       barrier, nobarrier
72           (default: on)
73
           Ensure that all IO write operations make it through the device
           cache and are stored permanently when the filesystem is at its
           consistency checkpoint. This typically means that a flush command
           is sent to the device that will synchronize all pending data and
           ordinary metadata blocks, then the superblock is written and
           another flush is issued.

           The write flushes incur a slight performance hit and also prevent
           the IO block scheduler from reordering requests in a more effective
           way. Disabling barriers gets rid of that penalty but will most
           certainly lead to a corrupted filesystem in case of a crash or
           power loss. The ordinary metadata blocks could still be unwritten
           at the time the new superblock is stored permanently, even though
           the superblock assumes that the metadata blocks it points to were
           stored permanently before.

           On a device with a volatile battery-backed write-back cache, the
           nobarrier option will not lead to filesystem corruption as the
           pending blocks are supposed to make it to the permanent storage.
92
93       check_int, check_int_data, check_int_print_mask=value
94           (since: 3.0, default: off)
95
           These debugging options control the behavior of the integrity
           checking module (requires the BTRFS_FS_CHECK_INTEGRITY config
           option). The main goal is to verify that all blocks from a given
           transaction period are properly linked.
100
101           check_int enables the integrity checker module, which examines all
102           block write requests to ensure on-disk consistency, at a large
103           memory and CPU cost.
104
105           check_int_data includes extent data in the integrity checks, and
106           implies the check_int option.
107
108           check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values
109           as defined in fs/btrfs/check-integrity.c, to control the integrity
110           checker module behavior.
111
112           See comments at the top of fs/btrfs/check-integrity.c for more
113           information.
114
115       clear_cache
116           Force clearing and rebuilding of the disk space cache if something
117           has gone wrong. See also: space_cache.
118
119       commit=seconds
120           (since: 3.12, default: 30)
121
           Set the interval of the periodic transaction commit, when data are
           synchronized to permanent storage. Higher interval values lead to
           a larger amount of unwritten data, all of which can be lost if the
           system crashes. The upper bound is not enforced, but a warning is
           printed if it’s more than 300 seconds (5 minutes). Use with care.
128
129       compress, compress=type[:level], compress-force,
130       compress-force=type[:level]
131           (default: off, level support since: 5.1)
132
133           Control BTRFS file data compression. Type may be specified as zlib,
134           lzo, zstd or no (for no compression, used for remounting). If no
135           type is specified, zlib is used. If compress-force is specified,
136           then compression will always be attempted, but the data may end up
137           uncompressed if the compression would make them larger.
138
           Both zlib and zstd (since version 5.1) expose the compression level
           as a tunable knob with higher levels trading speed and memory
           (zstd) for higher compression ratios. This can be set by appending
           a colon and the desired level. Zlib accepts the range [1, 9] and
           zstd accepts [1, 15]. If no level is set, both currently use a
           default level of 3. The value 0 is an alias for the default level.

           Without compress-force, some simple heuristics are applied to
           detect an incompressible file. If the first blocks written to a
           file are not compressible, the whole file is permanently marked to
           skip compression. As this is too simple, compress-force is a
           workaround that will compress most of the files at the cost of some
           wasted CPU cycles on failed attempts. Since kernel 4.15, the
           heuristics have been improved by using frequency sampling, repeated
           pattern detection and Shannon entropy calculation to avoid that.
155
156               Note
157               If compression is enabled, nodatacow and nodatasum are
158               disabled.
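
           The type[:level] syntax described above can be sketched as a small
           parser. The function name is hypothetical, and two behaviours are
           assumptions of this sketch rather than statements about the
           kernel: out-of-range levels are rejected, and a level given for
           lzo or no is treated as an error.

```python
LEVEL_RANGES = {"zlib": (1, 9), "zstd": (1, 15)}  # ranges quoted in the text
DEFAULT_LEVEL = 3                                  # default quoted in the text


def parse_compress(value=""):
    """Parse the 'type[:level]' part of compress= or compress-force=.

    Returns (type, level); level is None for types without a level knob.
    """
    ctype, _, level_text = value.partition(":")
    ctype = ctype or "zlib"  # a bare 'compress' selects zlib
    if ctype not in ("zlib", "lzo", "zstd", "no"):
        raise ValueError(f"unknown compression type: {ctype!r}")
    if ctype not in LEVEL_RANGES:
        if level_text:
            # Assumption of this sketch: reject levels for lzo and no.
            raise ValueError(f"{ctype} does not take a level")
        return ctype, None
    level = int(level_text) if level_text else 0
    if level == 0:  # 0 is an alias for the default level
        return ctype, DEFAULT_LEVEL
    low, high = LEVEL_RANGES[ctype]
    if not low <= level <= high:
        # Assumption of this sketch: reject rather than clamp.
        raise ValueError(f"{ctype} level must be in [{low}, {high}]")
    return ctype, level
```

           With this sketch, parse_compress("zstd:15") yields ("zstd", 15)
           and parse_compress("zlib:0") yields ("zlib", 3).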
159
160       datacow, nodatacow
161           (default: on)
162
           Enable data copy-on-write for newly created files. Nodatacow
           implies nodatasum, and disables compression. All files created
           under nodatacow also get the NOCOW file attribute set (see
           chattr(1)).

               Note
               If nodatacow or nodatasum are enabled, compression is disabled.

           Updates in-place improve performance for workloads that do frequent
           overwrites, at the cost of potential partial writes in case the
           write is interrupted (system crash, device failure).
173
174       datasum, nodatasum
175           (default: on)
176
           Enable data checksumming for newly created files. Datasum implies
           datacow, ie. the normal mode of operation. All files created under
           nodatasum inherit the "no checksums" property, however there’s no
           corresponding file attribute (see chattr(1)).

               Note
               If nodatacow or nodatasum are enabled, compression is disabled.

           There is a slight performance gain when checksums are turned off,
           because the corresponding metadata blocks holding the checksums do
           not need to be updated. The cost of checksumming the blocks in
           memory is much lower than the cost of the IO, and modern CPUs
           feature hardware support for the checksumming algorithm.
189
190       degraded
191           (default: off)
192
           Allow mounts with fewer devices than the RAID profile constraints
           require. A read-write mount (or remount) may fail when there are
           too many devices missing, for example if a stripe member is
           completely missing from RAID0.

           Since 4.14, the constraint checks have been improved and are
           verified on the chunk level, not at the device level. This allows
           degraded mounts of filesystems with mixed RAID profiles for data
           and metadata, even if the device number constraints would not be
           satisfied for some of the profiles.

           Example: metadata is raid1, data is single, the devices are
           /dev/sda and /dev/sdb.

           Suppose the data are completely stored on sda, then missing sdb
           will not prevent the mount, even though 1 missing device would
           normally prevent a filesystem with (any) single profile from
           mounting. If some of the data chunks are stored on sdb, then the
           constraint of single/data is not satisfied and the filesystem
           cannot be mounted.
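
           The chunk-level check from the example above can be modelled
           roughly as follows. This is a simplified sketch, not the kernel
           logic: the function name and data layout are hypothetical, and the
           per-profile tolerance table is an assumption based on the usual
           redundancy of each profile.

```python
# Number of missing devices each profile can tolerate (simplified model).
TOLERATED_MISSING = {
    "single": 0, "dup": 0, "raid0": 0,
    "raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2,
}


def can_mount_degraded(chunks, missing):
    """Check every chunk on its own instead of the device count as a whole.

    chunks:  iterable of (profile, devices holding a part of that chunk)
    missing: names of the devices that are absent
    """
    missing = set(missing)
    for profile, devices in chunks:
        lost = len(set(devices) & missing)
        if lost > TOLERATED_MISSING[profile]:
            return False
    return True


# The example from the text: raid1 metadata on both devices, single data.
data_on_sda = [("raid1", ["sda", "sdb"]), ("single", ["sda"])]
data_on_sdb = [("raid1", ["sda", "sdb"]), ("single", ["sdb"])]
```

           With sdb missing, can_mount_degraded(data_on_sda, {"sdb"}) is True
           and can_mount_degraded(data_on_sdb, {"sdb"}) is False.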
212
213       device=devicepath
214           Specify a path to a device that will be scanned for BTRFS
215           filesystem during mount. This is usually done automatically by a
216           device manager (like udev) or using the btrfs device scan command
217           (eg. run from the initial ramdisk). In cases where this is not
218           possible the device mount option can help.
219
220               Note
               Booting eg. a RAID1 system may fail even if all of the
               filesystem’s device paths are provided, as the actual device
               nodes may not have been discovered by the system at that point.
224
225       discard, nodiscard
226           (default: off)
227
           Enable discarding of freed file blocks. This is useful for SSD
           devices, thinly provisioned LUNs, or virtual machine images;
           however, every storage layer must support discard for it to work.
           If the backing device does not support asynchronous queued TRIM,
           then this operation can severely degrade performance, because a
           synchronous TRIM operation will be attempted instead. Queued TRIM
           requires chipsets and devices newer than SATA revision 3.1.
235
236           If it is not necessary to immediately discard freed blocks, then
237           the fstrim tool can be used to discard all free blocks in a batch.
238           Scheduling a TRIM during a period of low system activity will
239           prevent latent interference with the performance of other
240           operations. Also, a device may ignore the TRIM command if the range
241           is too small, so running a batch discard has a greater probability
242           of actually discarding the blocks.
243
244       enospc_debug, noenospc_debug
245           (default: off)
246
247           Enable verbose output for some ENOSPC conditions. It’s safe to use
248           but can be noisy if the system reaches near-full state.
249
250       fatal_errors=action
251           (since: 3.4, default: bug)
252
253           Action to take when encountering a fatal error.
254
255           bug
               BUG() on a fatal error. The system will stay in the crashed
               state and may still be partially usable, but a reboot is
               required for full operation.

           panic
               panic() on a fatal error. Depending on other system
               configuration, this may be followed by a reboot. Please refer
               to the documentation of kernel boot parameters, eg. panic,
               oops or crashkernel.
265
266       flushoncommit, noflushoncommit
267           (default: off)
268
269           This option forces any data dirtied by a write in a prior
270           transaction to commit as part of the current commit, effectively a
271           full filesystem sync.
272
273           This makes the committed state a fully consistent view of the file
274           system from the application’s perspective (i.e. it includes all
275           completed file system operations). This was previously the behavior
276           only when a snapshot was created.
277
278           When off, the filesystem is consistent but buffered writes may last
279           more than one transaction commit.
280
281       fragment=type
282           (depends on compile-time option BTRFS_DEBUG, since: 4.4, default:
283           off)
284
285           A debugging helper to intentionally fragment given type of block
286           groups. The type can be data, metadata or all. This mount option
287           should not be used outside of debugging environments and is not
288           recognized if the kernel config option BTRFS_DEBUG is not enabled.
289
290       inode_cache, noinode_cache
291           (since: 3.0, default: off)
292
           Enable free inode number caching. Not recommended unless files on
           your filesystem get assigned inode numbers that are approaching
           2^64. Normally, new files in each subvolume get inode numbers
           assigned incrementally (plus one from the last one) and the numbers
           are not reused. The mount option turns on caching of the existing
           inode numbers and reuse of inode numbers of deleted files.

           This option may slow down your system at first run, or after
           mounting without the option.

               Note
               Defaults to off due to a potential overflow problem when the
               free space checksums don’t fit inside a single page.

           Don’t use this option unless you really need it. The inode number
           limit on 64bit systems is 2^64, which is practically enough for the
           whole filesystem lifetime. Due to the implementation of the Linux
           VFS layer, the inode numbers on 32bit systems are only 32 bits
           wide. This lowers the limit significantly and makes it possible to
           reach it. In such a case, this mount option will help.
           Alternatively, files with high inode numbers can be copied to a new
           subvolume, which will effectively start the inode numbers from the
           beginning again.
314
315       logreplay, nologreplay
316           (default: on, even read-only)
317
318           Enable/disable log replay at mount time. See also treelog. Note
319           that nologreplay is the same as norecovery.
320
321               Warning
322               currently, the tree log is replayed even with a read-only
323               mount! To disable that behaviour, mount also with nologreplay.
324
325       max_inline=bytes
326           (default: min(2048, page size) )
327
           Specify the maximum amount of file data that can be inlined in a
           metadata B-tree leaf. The value is specified in bytes, optionally
           with a K suffix (case insensitive). In practice, this value is
           limited by the filesystem block size (named sectorsize at mkfs
           time), and the memory page size of the system. In case of the
           sectorsize limit, there’s some space unavailable due to leaf
           headers. For example, with a 4k sectorsize, the maximum size of
           inline data is about 3900 bytes.
336
337           Inlining can be completely turned off by specifying 0. This will
338           increase data block slack if file sizes are much smaller than block
339           size but will reduce metadata consumption in return.
340
341               Note
342               the default value has changed to 2048 in kernel 4.6.
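
           The value syntax and the default described above can be sketched
           as follows. Both function names are hypothetical, and only the K
           suffix mentioned in the text is handled.

```python
def parse_max_inline(text):
    """Parse a max_inline value: bytes, optionally with a K suffix."""
    text = text.strip()
    if text[-1:].lower() == "k":  # the suffix is case insensitive
        return int(text[:-1]) * 1024
    return int(text)


def default_max_inline(page_size):
    """Default from the text: min(2048, page size)."""
    return min(2048, page_size)
```

           For example, parse_max_inline("4K") gives 4096, and
           parse_max_inline("0") gives 0, which turns inlining off.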
343
344       metadata_ratio=value
345           (default: 0, internal logic)
346
           Specifies that 1 metadata chunk should be allocated after every
           value data chunks. The default behaviour depends on internal logic:
           the filesystem attempts to maintain some percentage of unused
           metadata space, which is not always possible if there’s not enough
           space left for chunk allocation. The option could be useful to
           override the internal logic in favor of metadata allocation if the
           expected workload is metadata intensive (snapshots, reflinks,
           xattrs, inlined files).
355
356       norecovery
357           (since: 4.5, default: off)
358
359           Do not attempt any data recovery at mount time. This will disable
360           logreplay and avoids other write operations. Note that this option
361           is the same as nologreplay.
362
363               Note
364               The opposite option recovery used to have different meaning but
365               was changed for consistency with other filesystems, where
366               norecovery is used for skipping log replay. BTRFS does the same
367               and in general will try to avoid any write operations.
368
369       rescan_uuid_tree
370           (since: 3.12, default: off)
371
372           Force check and rebuild procedure of the UUID tree. This should not
373           normally be needed.
374
375       skip_balance
376           (since: 3.3, default: off)
377
378           Skip automatic resume of an interrupted balance operation. The
379           operation can later be resumed with btrfs balance resume, or the
380           paused state can be removed with btrfs balance cancel. The default
381           behaviour is to resume an interrupted balance immediately after a
382           volume is mounted.
383
384       space_cache, space_cache=version, nospace_cache
385           (nospace_cache since: 3.2, space_cache=v1 and space_cache=v2 since
386           4.5, default: space_cache=v1)
387
388           Options to control the free space cache. The free space cache
389           greatly improves performance when reading block group free space
390           into memory. However, managing the space cache consumes some
391           resources, including a small amount of disk space.
392
393           There are two implementations of the free space cache. The original
394           one, referred to as v1, is the safe default. The v1 space cache can
395           be disabled at mount time with nospace_cache without clearing.
396
397           On very large filesystems (many terabytes) and certain workloads,
398           the performance of the v1 space cache may degrade drastically. The
399           v2 implementation, which adds a new B-tree called the free space
400           tree, addresses this issue. Once enabled, the v2 space cache will
401           always be used and cannot be disabled unless it is cleared. Use
402           clear_cache,space_cache=v1 or clear_cache,nospace_cache to do so.
403           If v2 is enabled, kernels without v2 support will only be able to
404           mount the filesystem in read-only mode. The btrfs(8) command
405           currently only has read-only support for v2. A read-write command
406           may be run on a v2 filesystem by clearing the cache, running the
407           command, and then remounting with space_cache=v2.
408
409           If a version is not explicitly specified, the default
410           implementation will be chosen, which is v1.
411
412       ssd, ssd_spread, nossd, nossd_spread
413           (default: SSD autodetected)
414
           Options to control SSD allocation schemes. By default, BTRFS will
           enable or disable SSD optimizations depending on the status of a
           device with respect to rotational or non-rotational type. This is
           determined by the contents of /sys/block/DEV/queue/rotational. If
           it is 0, the ssd option is turned on. The option nossd will disable
           the autodetection.
421
422           The optimizations make use of the absence of the seek penalty
423           that’s inherent for the rotational devices. The blocks can be
424           typically written faster and are not offloaded to separate threads.
425
               Note
               Since 4.14, the block layout optimizations have been dropped.
               They used to help with the first generations of SSD devices,
               whose FTL (flash translation layer) was not effective; the
               optimization was supposed to improve the wear by better
               aligning blocks. This is no longer true with modern SSD devices
               and the optimization had no real benefit. Furthermore, it
               caused increased fragmentation. The layout tuning has been kept
               intact for the option ssd_spread.

           The ssd_spread mount option attempts to allocate into bigger and
           aligned chunks of unused space, and may perform better on low-end
           SSDs. ssd_spread implies ssd, enabling all other SSD heuristics as
           well. The option nossd will disable all SSD options while
           nossd_spread only disables ssd_spread.
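
           The autodetection input described above can be inspected from user
           space. A minimal sketch, assuming a hypothetical helper name; the
           sysfs_root parameter exists only so the function can be tried
           against a test directory.

```python
from pathlib import Path


def is_non_rotational(dev, sysfs_root="/sys"):
    """Return True if /sys/block/DEV/queue/rotational contains 0.

    Per the text, a value of 0 is what turns the ssd option on by default.
    """
    flag = Path(sysfs_root, "block", dev, "queue", "rotational")
    return flag.read_text().strip() == "0"
```

           On a real system, is_non_rotational("sda") reports whether sda
           would be treated as non-rotational.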
440
441       subvol=path
442           Mount subvolume from path rather than the toplevel subvolume. The
443           path is always treated as relative to the toplevel subvolume. This
444           mount option overrides the default subvolume set for the given
445           filesystem.
446
447       subvolid=subvolid
           Mount subvolume specified by a subvolid number rather than the
           toplevel subvolume. You can use btrfs subvolume list or btrfs
           subvolume show to see subvolume ID numbers. This mount option
           overrides the default subvolume set for the given filesystem.
452
453               Note
454               if both subvolid and subvol are specified, they must point at
455               the same subvolume, otherwise the mount will fail.
456
457       thread_pool=number
458           (default: min(NRCPUS + 2, 8) )
459
           The number of worker threads to start. NRCPUS is the number of
           on-line CPUs detected at the time of mount. A small number leads to
           less parallelism in processing data and metadata, while higher
           numbers could lead to a performance hit due to increased locking
           contention, process scheduling, cache-line bouncing or costly data
           transfers between local CPU memories.
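
           The default formula above is simple enough to compute directly. In
           this sketch the function name is hypothetical, and os.cpu_count()
           is used as a stand-in for NRCPUS, which may differ from the number
           of on-line CPUs the kernel sees at mount time.

```python
import os


def default_thread_pool(nr_cpus=None):
    """Default from the text: min(NRCPUS + 2, 8)."""
    if nr_cpus is None:
        nr_cpus = os.cpu_count() or 1  # approximation of NRCPUS
    return min(nr_cpus + 2, 8)
```

           A single-CPU machine gets 3 worker threads; any machine with 6 or
           more CPUs gets the cap of 8.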
466
467       treelog, notreelog
468           (default: on)
469
470           Enable the tree logging used for fsync and O_SYNC writes. The tree
471           log stores changes without the need of a full filesystem sync. The
472           log operations are flushed at sync and transaction commit. If the
473           system crashes between two such syncs, the pending tree log
474           operations are replayed during mount.
475
               Warning
               Currently, the tree log is replayed even with a read-only
               mount! To disable that behaviour, also mount with nologreplay.

           The tree log could contain new files/directories; these would not
           exist on a mounted filesystem if the log is not replayed.
481
482       usebackuproot, nousebackuproot
483           (since: 4.6, default: off)
484
485           Enable autorecovery attempts if a bad tree root is found at mount
486           time. Currently this scans a backup list of several previous tree
487           roots and tries to use the first readable. This can be used with
488           read-only mounts as well.
489
490               Note
491               This option has replaced recovery.
492
493       user_subvol_rm_allowed
494           (default: off)
495
496           Allow subvolumes to be deleted by their respective owner.
497           Otherwise, only the root user can do that.
498
               Note
               Historically, any user could create a snapshot even if they
               were not the owner of the source subvolume, and the subvolume
               deletion has been restricted for that reason. The subvolume
               creation has since been restricted, but this mount option is
               still required. This is a usability issue. Since 4.18, the
               rmdir(2) syscall can delete an empty subvolume just like an
               ordinary directory. Whether this is possible can be detected at
               runtime, see the rmdir_subvol feature in FILESYSTEM FEATURES.
508
   DEPRECATED MOUNT OPTIONS
       List of mount options that are deprecated or have been removed, kept
       here for backward compatibility.
512
513       alloc_start=bytes
514           (default: 1M, minimum: 1M, deprecated since: 4.13)
515
516           Debugging option to force all block allocations above a certain
517           byte threshold on each block device. The value is specified in
518           bytes, optionally with a K, M, or G suffix (case insensitive).
519
520       recovery
521           (since: 3.2, default: off, deprecated since: 4.5)
522
523               Note
524               this option has been replaced by usebackuproot and should not
525               be used but will work on 4.5+ kernels.
526
527       subvolrootid=objectid
528           (irrelevant since: 3.2, formally deprecated since: 3.10)
529
530           A workaround option from times (pre 3.2) when it was not possible
531           to mount a subvolume that did not reside directly under the
532           toplevel subvolume.
533
   NOTES ON GENERIC MOUNT OPTIONS
       Some of the general mount options from mount(8) that affect BTRFS and
       are worth mentioning.

       noatime
           Under read-intensive workloads, specifying noatime significantly
           improves performance because no new access time information needs
           to be written. Without this option, the default is relatime, which
           only reduces the number of inode atime updates in comparison to the
           traditional strictatime. The worst case for atime updates under
           relatime occurs when many files are read whose atime is older than
           24 h and which are freshly snapshotted. In that case the atime is
           updated and COW happens, for each file, in bulk. See also
           https://lwn.net/Articles/499293/ - Atime and btrfs: a bad
           combination? (LWN, 2012-05-31).

           Note that noatime may break applications that rely on atime updates
           like the venerable Mutt (unless you use maildir mailboxes).
552

FILESYSTEM FEATURES

       The basic set of filesystem features gets extended over time. Backward
       compatibility is maintained and the features are optional; they need
       to be explicitly asked for, so accidental use will not create
       incompatibilities.
558
559       There are several classes and the respective tools to manage the
560       features:
561
562       at mkfs time only
563           This is namely for core structures, like the b-tree nodesize or
564           checksum algorithm, see mkfs.btrfs(8) for more details.
565
566       after mkfs, on an unmounted filesystem
           Features that may optimize internal structures or add new
           structures to support new functionality, see btrfstune(8). The
           command btrfs inspect-internal dump-super device will dump a
           superblock; you can map the value of incompat_flags to the features
           listed below.
572
573       after mkfs, on a mounted filesystem
574           The features of a filesystem (with a given UUID) are listed in
575           /sys/fs/btrfs/UUID/features/, one file per feature. The status is
576           stored inside the file. The value 1 is for enabled and active,
577           while 0 means the feature was enabled at mount time but turned off
578           afterwards.
579
           Whether a particular feature can be turned on on a mounted
           filesystem can be found in the directory /sys/fs/btrfs/features/,
           one file per feature. The value 1 means the feature can be enabled.
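
           The per-filesystem feature files described above can be read into
           a dictionary. A minimal sketch: the function name is hypothetical,
           and the sysfs_root parameter exists only so the function can be
           tried against a test directory.

```python
from pathlib import Path


def read_features(fs_uuid, sysfs_root="/sys"):
    """Map each file in /sys/fs/btrfs/UUID/features/ to True (1) or False."""
    feature_dir = Path(sysfs_root, "fs", "btrfs", fs_uuid, "features")
    return {
        entry.name: entry.read_text().strip() == "1"
        for entry in sorted(feature_dir.iterdir())
        if entry.is_file()
    }
```

           Calling read_features with the UUID of a mounted filesystem
           returns one entry per feature file.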
583
584       List of features (see also mkfs.btrfs(8) section FILESYSTEM FEATURES):
585
586       big_metadata
587           (since: 3.4)
588
589           the filesystem uses nodesize for metadata blocks, this can be
590           bigger than the page size
591
592       compress_lzo
593           (since: 2.6.38)
594
595           the lzo compression has been used on the filesystem, either as a
596           mount option or via btrfs filesystem defrag.
597
598       compress_zstd
599           (since: 4.14)
600
601           the zstd compression has been used on the filesystem, either as a
602           mount option or via btrfs filesystem defrag.
603
604       default_subvol
605           (since: 2.6.34)
606
607           the default subvolume has been set on the filesystem
608
609       extended_iref
610           (since: 3.7)
611
612           increased hardlink limit per file in a directory to 65536, older
613           kernels supported a varying number of hardlinks depending on the
614           sum of all file name sizes that can be stored into one metadata
615           block
616
617       metadata_uuid
618           (since: 5.0)
619
620           the main filesystem UUID is the metadata_uuid, which stores the new
621           UUID only in the superblock while all metadata blocks still have
622           the UUID set at mkfs time, see btrfstune(8) for more
623
624       mixed_backref
625           (since: 2.6.31)
626
627           the last major disk format change, improved backreferences, now
628           default
629
630       mixed_groups
631           (since: 2.6.37)
632
633           mixed data and metadata block groups, ie. the data and metadata are
634           not separated and occupy the same block groups, this mode is
635           suitable for small volumes as there are no constraints how the
636           remaining space should be used (compared to the split mode, where
637           empty metadata space cannot be used for data and vice versa)
638
639           on the other hand, the final layout is quite unpredictable and
640           possibly highly fragmented, which means worse performance
641
642       no_holes
643           (since: 3.14)
644
645           improved representation of file extents where holes are not
646           explicitly stored as an extent, saves a few percent of metadata if
647           sparse files are used
648
649       raid56
650           (since: 3.9)
651
652           the filesystem contains or contained a raid56 profile of block
653           groups
654
655       rmdir_subvol
656           (since: 4.18)
657
658           indicates that the rmdir(2) syscall can delete an empty subvolume
659           just like an ordinary directory. Note that this feature only
660           depends on the kernel version.
661
662       skinny_metadata
663           (since: 3.10)
664
665           reduced-size metadata for extent references, saves a few percent of
666           metadata
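
          The feature names above should correspond to the file names the
          kernel exposes under /sys/fs/btrfs/features. As a minimal sketch,
          the helper below lists them. The function name list_btrfs_features
          is hypothetical, and the optional directory argument exists only so
          that the helper can be pointed at another directory.

```shell
# List the entries of a sysfs-style features directory, one name per line.
# Defaults to /sys/fs/btrfs/features; the argument is an illustrative extra.
list_btrfs_features() {
    dir=${1:-/sys/fs/btrfs/features}
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    for f in "$dir"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$f" ] && basename "$f"
    done
    return 0
}
```

          Calling list_btrfs_features without arguments reads the default
          sysfs path and fails with a message if the btrfs module is not
          loaded.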
667
668   SWAPFILE SUPPORT
669       Swapfiles are supported since kernel 5.0. Use swapon(8) to activate
670       a swapfile. There are some limitations of the implementation in btrfs
671       and the Linux swap subsystem:
672
673       ·   filesystem - must be on a single device only
674
675       ·   swapfile - the containing subvolume cannot be snapshotted
676
677       ·   swapfile - must be preallocated
678
679       ·   swapfile - must be nodatacow (i.e. also nodatasum)
680
681       ·   swapfile - must not be compressed
682
683       The limitations come mainly from the COW-based design and the block
684       mapping layer that enables advanced features like relocation and
685       multi-device filesystems. However, the swap subsystem expects simpler
686       mapping and no background changes of the file blocks once they’ve been
687       attached to swap.
688
689       With active swapfiles, the following whole-filesystem operations will
690       skip swapfile extents or may fail:
691
692       ·   balance - block groups with swapfile extents are skipped and
693           reported, the rest will be processed normally
694
695       ·   resize grow - unaffected
696
697       ·   resize shrink - works as long as the extents are outside of the
698           shrunk range
699
700       ·   device add - a new device does not interfere with an existing
701           swapfile and this operation will work, though no new swapfile can
702           be activated afterwards
703
704       ·   device delete - if the device has been added as above, it can
705           also be deleted
706
707       ·   device replace - ditto
708
709       When there are no active swapfiles and a whole-filesystem exclusive
710       operation is running (i.e. balance, device delete, shrink), swapfiles
711       cannot be activated for the time being. The operation must finish
712       first.
713
714           # truncate -s 0 swapfile
715           # chattr +C swapfile
716           # fallocate -l 2G swapfile
717           # chmod 0600 swapfile
718           # mkswap swapfile
719           # swapon swapfile
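
          The order of the commands above matters: the C attribute can only be
          set while the file is still empty, so chattr +C has to come before
          fallocate. A minimal sketch that wraps the same steps (without the
          final swapon) in a function follows. The names run and
          make_btrfs_swapfile and the DRY_RUN variable are hypothetical
          helpers, not part of any btrfs tool.

```shell
# Run a command, or only print it when DRY_RUN=1 (illustrative convention).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a swapfile at path $1 with size $2, stopping at the first failure.
make_btrfs_swapfile() {
    path=$1
    size=$2
    run truncate -s 0 "$path" &&     # start from an empty file
    run chattr +C "$path" &&         # nodatacow, must precede allocation
    run fallocate -l "$size" "$path" &&
    run chmod 0600 "$path" &&
    run mkswap "$path"
}
```

          With DRY_RUN=1 the five commands are only printed, which allows
          reviewing them first. Without it they are executed and the file can
          then be activated with swapon(8).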
720

CHECKSUM ALGORITHMS

722       There are several checksum algorithms supported. The default,
723       backward-compatible algorithm is crc32c. Since kernel 5.5 there are
724       three more with different characteristics and trade-offs regarding
725       speed and strength. The following list may help you decide which one
726       to select.
727
728       CRC32C (32bit digest)
729           default, best backward compatibility, very fast, modern CPUs have
730           instruction-level support, not collision-resistant but still good
731           error detection capabilities
732
733       XXHASH (64bit digest)
734           can be used as CRC32C successor, very fast, optimized for modern
735           CPUs utilizing instruction pipelining, good collision resistance
736           and error detection
737
738       SHA256 (256bit digest)
739           a cryptographic-strength hash, relatively slow but with possible
740           CPU instruction acceleration or specialized hardware cards, FIPS
741           certified and in wide use
742
743       BLAKE2b (256bit digest)
744           a cryptographic-strength hash, relatively fast with possible CPU
745           acceleration using SIMD extensions, not standardized but based on
746           BLAKE which was a SHA3 finalist, in wide use, the algorithm used is
747           BLAKE2b-256 that’s optimized for 64bit platforms
748
749       The digest size affects the overall size of data block checksums stored
750       in the filesystem. The metadata blocks have a fixed area of up to 256
751       bits (32 bytes), so there’s no increase. Each data block has a separate
752       checksum stored, with additional overhead of the b-tree leaves.
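
          To get a feel for how the digest size translates into stored
          checksum data, the sketch below multiplies the number of data blocks
          by the digest size. It assumes 4KiB data blocks (an assumption, a
          different block size can be passed as the third argument) and
          ignores the b-tree leaf overhead. The function name csum_bytes is
          hypothetical.

```shell
# Estimate raw checksum bytes for a given amount of data.
#   $1 = data size in bytes, $2 = digest size in bits,
#   $3 = data block size in bytes (assumed 4096 if omitted)
csum_bytes() {
    data_bytes=$1
    digest_bits=$2
    block_size=${3:-4096}
    # Round up to whole blocks, one digest per block.
    blocks=$(( (data_bytes + block_size - 1) / block_size ))
    echo $(( blocks * digest_bits / 8 ))
}
```

          For example, csum_bytes 1099511627776 32 (1TiB of data with a 32bit
          digest) prints 1073741824, i.e. 1GiB, while the same data with a
          256bit digest yields 8589934592, i.e. 8GiB.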
753
754       Approximate relative performance of the algorithms, measured against
755       CRC32C using reference software implementations on a 3.5GHz Intel CPU:
756
757       ┌────────┬─────────────┬───────┐
758       │        │             │       │
759       │Digest  │ Cycles/4KiB │ Ratio │
760       ├────────┼─────────────┼───────┤
761       │        │             │       │
762       │CRC32C  │        1700 │  1.00 │
763       ├────────┼─────────────┼───────┤
764       │        │             │       │
765       │XXHASH  │        2500 │  1.44 │
766       ├────────┼─────────────┼───────┤
767       │        │             │       │
768       │SHA256  │      105000 │    61 │
769       ├────────┼─────────────┼───────┤
770       │        │             │       │
771       │BLAKE2b │       22000 │    13 │
772       └────────┴─────────────┴───────┘
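
          The cycle counts can be turned into a rough single-core throughput
          figure: at 3.5GHz a core has 3.5 * 10^9 cycles per second to spend
          on 4KiB blocks. The sketch below does this with integer arithmetic.
          The name est_mib_per_sec is hypothetical and the result ignores I/O,
          caching and hardware acceleration, so it is only an illustration of
          the table, not a benchmark.

```shell
# Rough checksum throughput in MiB/s for one core.
#   $1 = cycles per 4KiB block (from the table)
#   $2 = CPU frequency in Hz (defaults to the 3.5GHz used for the table)
est_mib_per_sec() {
    cycles_per_block=$1
    cpu_hz=${2:-3500000000}
    # blocks per second * 4096 bytes per block, converted to MiB
    echo $(( cpu_hz / cycles_per_block * 4096 / 1048576 ))
}
```

          With the table values, est_mib_per_sec 1700 prints 8042 for CRC32C
          and est_mib_per_sec 105000 prints 130 for SHA256.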
773

FILESYSTEM LIMITS

775       maximum file name length
776           255 bytes
777
778       maximum symlink target length
779           depends on the nodesize value, for 4k it’s 3949 bytes, for larger
780           nodesize it’s 4095 due to the system limit PATH_MAX
781
782           The symlink target need not be a valid path, i.e. the path name
783           components can exceed the limits (NAME_MAX); there’s no content
784           validation at symlink(3) creation.
785
786       maximum number of inodes
787           2^64 but depends on the available metadata space as the inodes are
788           created dynamically
789
790       inode numbers
791           minimum number: 256 (for subvolumes), regular files and
792           directories: 257
793
794       maximum file length
795           inherent limit of btrfs is 2^64 (16 EiB) but the linux VFS limit is
796           2^63 (8 EiB)
797
798       maximum number of subvolumes
799           the subvolume ids can go up to 2^64 but the number of actual
800           subvolumes depends on the available metadata space; the space
801           consumed by all subvolume metadata, which includes bookkeeping of
802           shared extents, can be large (MiB, GiB)
803
804       maximum number of hardlinks of a file in a directory
805           65536 when the extref feature is turned on during mkfs (default),
806           roughly 100 otherwise
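
          The file name limit is counted in bytes rather than characters (as
          with NAME_MAX), so a name containing multi-byte characters reaches
          it with fewer than 255 characters. A minimal check follows. The
          helper name name_fits is hypothetical.

```shell
# Succeed if the file name $1 is at most 255 bytes long.
name_fits() {
    # wc -c counts bytes, not characters, regardless of the locale.
    bytes=$(( $(printf '%s' "$1" | wc -c) ))
    [ "$bytes" -le 255 ]
}
```

          The function only checks the length of a single name component. It
          does not look at the symlink or path limits listed above.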
807

BOOTLOADER SUPPORT

809       GRUB2 (https://www.gnu.org/software/grub) has the most advanced support
810       of booting from BTRFS with respect to features.
811
812       EXTLINUX (from the https://syslinux.org project) can boot but does not
813       support all features. Please check the upstream documentation before
814       you use it.
815

FILE ATTRIBUTES

817       The btrfs filesystem supports setting the following file attributes
818       using the chattr(1) utility:
819
820       a
821           append only, new writes are always written at the end of the file
822
823       A
824           no atime updates
825
826       c
827           compress data, all data written after this attribute is set will be
828           compressed. Please note that compression is also affected by the
829           mount options or the parent directory attributes.
830
831           When set on a directory, all newly created files will inherit this
832           attribute.
833
834       C
835           no copy-on-write, file modifications are done in-place
836
837           When set on a directory, all newly created files will inherit this
838           attribute.
839
840               Note
841               due to implementation limitations, this flag can be set/unset
842               only on empty files.
843
844       d
845           no dump, makes sense with 3rd party tools like dump(8), on BTRFS
846           the attribute can be set/unset but no other special handling is
847           done
848
849       D
850           synchronous directory updates, for more details search open(2) for
851           O_SYNC and O_DSYNC
852
853       i
854           immutable, no file data and metadata changes allowed even for the
855           root user as long as this attribute is set (obviously the exception
856           is unsetting the attribute)
857
858       S
859           synchronous updates, for more details search open(2) for O_SYNC and
860           O_DSYNC
861
862       X
863           no compression, permanently turn off compression on the given file.
864           Any compression mount options will not affect this file.
865
866           When set on a directory, all newly created files will inherit this
867           attribute.
868
869       No other attributes are supported. For the complete list please refer
870       to the chattr(1) manual page.
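
          The attributes currently set on a file can be inspected with
          lsattr(1), which prints a field of flag letters followed by the file
          name. The sketch below picks the letters listed above out of such a
          field. The helper name btrfs_attrs_set is hypothetical and the flag
          strings used with it here are made up for illustration.

```shell
# Print which of the attributes described above occur in a flags field
# as printed by lsattr. Matching is case-sensitive (c and C differ).
btrfs_attrs_set() {
    flags=$1
    out=""
    for letter in a A c C d D i S X; do
        case $flags in
            *"$letter"*) out="$out$letter" ;;
        esac
    done
    printf '%s\n' "$out"
}
```

          A possible use is btrfs_attrs_set "$(lsattr -d FILE | cut -d' ' -f1)"
          where FILE is the path to inspect. Letters that are not in the list
          above are ignored.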
871

CONTROL DEVICE

873       There’s a character special device /dev/btrfs-control with major and
874       minor numbers 10 and 234 (the device can be found under the misc
875       category).
876
877           $ ls -l /dev/btrfs-control
878           crw------- 1 root root 10, 234 Jan  1 12:00 /dev/btrfs-control
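
          The same numbers can be read programmatically. GNU stat(1) prints
          the major and minor device numbers in hexadecimal with the %t and %T
          format sequences, so 10 and 234 appear as a and ea. A small sketch
          follows. The helper names hex_pair_to_dec and dev_numbers are
          hypothetical.

```shell
# Convert a "MAJOR MINOR" pair of hexadecimal numbers to decimal.
hex_pair_to_dec() {
    major=${1% *}
    minor=${1#* }
    echo "$((0x$major)) $((0x$minor))"
}

# Print the decimal major and minor numbers of the device node $1.
# Relies on the GNU stat format sequences %t and %T.
dev_numbers() {
    hex=$(stat -c '%t %T' "$1") || return 1
    hex_pair_to_dec "$hex"
}
```

          On a system where the node exists with the numbers given above,
          dev_numbers /dev/btrfs-control would print 10 234.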
879
880       The device accepts some ioctl calls that can perform the following
881       actions on the filesystem module:
882
883       ·   scan devices for btrfs filesystem (i.e. to let multi-device
884           filesystems mount automatically) and register them with the kernel
885           module
886
887       ·   similar to scan, but also wait until the device scanning process is
888           finished for a given filesystem
889
890       ·   get the supported features (can also be found under
891           /sys/fs/btrfs/features)
892
893       The device is usually created by a system device node manager (e.g.
894       udev), but can be created manually:
895
896           # mknod --mode=600 /dev/btrfs-control c 10 234
897
898       The control device is not strictly required, but without it device
899       scanning will not work and a workaround is needed to mount a
900       multi-device filesystem: the mount option device can trigger the
901       device scanning during mount.
902

SEE ALSO

904       acl(5), btrfs(8), chattr(1), fstrim(8), ioctl(2), mkfs.btrfs(8),
905       mount(8), swapon(8)
906
907
908
909Btrfs v5.4                        12/06/2019                     BTRFS-MAN5(5)