BTRFS-MAN5(5) Btrfs Manual BTRFS-MAN5(5)

NAME
6 btrfs-man5 - topics about the BTRFS filesystem (mount options,
7 supported file attributes and other)

DESCRIPTION
10 This document describes topics related to BTRFS that are not specific
11 to the tools. Currently covers:
12
13 1. mount options
14
15 2. filesystem features
16
17 3. checksum algorithms
18
19 4. filesystem limits
20
21 5. bootloader support
22
23 6. file attributes
24
25 7. control device

MOUNT OPTIONS
28 This section describes mount options specific to BTRFS. For the generic
29 mount options please refer to mount(8) manpage. The options are sorted
30 alphabetically (discarding the no prefix).
31
Note
Most mount options apply to the whole filesystem, and only the
options given for the first mounted subvolume take effect. This is
due to lack of implementation and may change in the future. For
example, you can’t set per-subvolume nodatacow, nodatasum, or
compress using mount options. This should eventually be fixed, but
it has proved difficult to implement correctly within the Linux VFS
framework.
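
To see which options are actually in effect for each mounted
subvolume, the mounts table can be filtered by filesystem type. This
is a minimal sketch; the helper name btrfs_mount_opts is hypothetical
and not part of btrfs-progs, and it assumes the usual /proc/mounts
field order (device, mount point, type, options).

```shell
# btrfs_mount_opts [FILE]
# Hypothetical helper: prints "mountpoint options" for every btrfs
# entry in a mounts table (defaults to /proc/mounts).
btrfs_mount_opts() {
    awk '$3 == "btrfs" { print $2, $4 }' "${1:-/proc/mounts}"
}

# Show the live table when it is available.
if [ -r /proc/mounts ]; then
    btrfs_mount_opts
fi
```

Comparing the lines printed for two subvolumes of the same filesystem
is a quick way to check whether an option given on a later mount was
applied.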
40
41 acl, noacl
42 (default: on)
43
44 Enable/disable support for Posix Access Control Lists (ACLs). See
45 the acl(5) manual page for more information about ACLs.
46
47 The support for ACL is build-time configurable (BTRFS_FS_POSIX_ACL)
48 and mount fails if acl is requested but the feature is not compiled
49 in.
50
51 autodefrag, noautodefrag
52 (since: 3.0, default: off)
53
54 Enable automatic file defragmentation. When enabled, small random
55 writes into files (in a range of tens of kilobytes, currently it’s
56 64K) are detected and queued up for the defragmentation process.
57 Not well suited for large database workloads.
58
The read latency may increase due to reading the adjacent blocks
that make up the range for defragmentation; a successive write will
merge the blocks in the new location.
62
63 Warning
64 Defragmenting with Linux kernel versions < 3.9 or ≥ 3.14-rc2 as
65 well as with Linux stable kernel versions ≥ 3.10.31, ≥ 3.12.12
66 or ≥ 3.13.4 will break up the reflinks of COW data (for example
67 files copied with cp --reflink, snapshots or de-duplicated
68 data). This may cause considerable increase of space usage
69 depending on the broken up reflinks.
70
71 barrier, nobarrier
72 (default: on)
73
74 Ensure that all IO write operations make it through the device
75 cache and are stored permanently when the filesystem is at its
76 consistency checkpoint. This typically means that a flush command
77 is sent to the device that will synchronize all pending data and
78 ordinary metadata blocks, then writes the superblock and issues
79 another flush.
80
The write flushes incur a slight performance hit and also prevent
the IO block scheduler from reordering requests in a more effective
way. Disabling barriers gets rid of that penalty but will most
certainly lead to a corrupted filesystem in case of a crash or power
loss. The ordinary metadata blocks could still be unwritten at the
time the new superblock is stored permanently, even though the
superblock expects that the metadata blocks it points to were stored
permanently before it.
88
89 On a device with a volatile battery-backed write-back cache, the
90 nobarrier option will not lead to filesystem corruption as the
91 pending blocks are supposed to make it to the permanent storage.
92
93 check_int, check_int_data, check_int_print_mask=value
94 (since: 3.0, default: off)
95
96 These debugging options control the behavior of the integrity
97 checking module (the BTRFS_FS_CHECK_INTEGRITY config option
98 required). The main goal is to verify that all blocks from a given
99 transaction period are properly linked.
100
101 check_int enables the integrity checker module, which examines all
102 block write requests to ensure on-disk consistency, at a large
103 memory and CPU cost.
104
105 check_int_data includes extent data in the integrity checks, and
106 implies the check_int option.
107
108 check_int_print_mask takes a bitmask of BTRFSIC_PRINT_MASK_* values
109 as defined in fs/btrfs/check-integrity.c, to control the integrity
110 checker module behavior.
111
112 See comments at the top of fs/btrfs/check-integrity.c for more
113 information.
114
115 clear_cache
116 Force clearing and rebuilding of the disk space cache if something
117 has gone wrong. See also: space_cache.
118
119 commit=seconds
120 (since: 3.12, default: 30)
121
122 Set the interval of periodic transaction commit when data are
123 synchronized to permanent storage. Higher interval values lead to
124 larger amount of unwritten data, which has obvious consequences
125 when the system crashes. The upper bound is not forced, but a
126 warning is printed if it’s more than 300 seconds (5 minutes). Use
127 with care.
128
129 compress, compress=type[:level], compress-force,
130 compress-force=type[:level]
131 (default: off, level support since: 5.1)
132
133 Control BTRFS file data compression. Type may be specified as zlib,
134 lzo, zstd or no (for no compression, used for remounting). If no
135 type is specified, zlib is used. If compress-force is specified,
136 then compression will always be attempted, but the data may end up
137 uncompressed if the compression would make them larger.
138
139 Both zlib and zstd (since version 5.1) expose the compression level
140 as a tunable knob with higher levels trading speed and memory
141 (zstd) for higher compression ratios. This can be set by appending
142 a colon and the desired level. Zlib accepts the range [1, 9] and
143 zstd accepts [1, 15]. If no level is set, both currently use a
default level of 3. The value 0 is an alias for the default level.
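
The type and level rules can be checked before a value is handed to
mount. The following is only a sketch: check_compress_opt is a
hypothetical helper, it encodes the ranges stated in this section,
and it assumes that lzo and no take no level because none is
documented for them here.

```shell
# check_compress_opt TYPE[:LEVEL]
# Hypothetical helper: returns 0 if the value follows the rules in
# this section, 1 otherwise. It does not ask the kernel.
check_compress_opt() {
    type=${1%%:*}
    case $1 in
        *:*) level=${1#*:}
             # a colon must be followed by digits only
             case $level in ''|*[!0-9]*) return 1 ;; esac ;;
        *)   level= ;;
    esac
    case $type in
        zlib) max=9 ;;
        zstd) max=15 ;;
        # assumption: no level is accepted for lzo or no
        lzo|no) [ -z "$level" ]; return ;;
        *) return 1 ;;
    esac
    # no level means the default; 0 is an alias for the default
    [ -z "$level" ] || [ "$level" -le "$max" ]
}
```

A caller could guard a mount with it, for example
check_compress_opt zstd:3 && mount -o compress=zstd:3 DEVICE DIR,
where DEVICE and DIR are placeholders.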
145
146 Otherwise some simple heuristics are applied to detect an
147 incompressible file. If the first blocks written to a file are not
148 compressible, the whole file is permanently marked to skip
149 compression. As this is too simple, the compress-force is a
150 workaround that will compress most of the files at the cost of some
151 wasted CPU cycles on failed attempts. Since kernel 4.15, a set of
152 heuristic algorithms have been improved by using frequency
153 sampling, repeated pattern detection and Shannon entropy
154 calculation to avoid that.
155
156 Note
157 If compression is enabled, nodatacow and nodatasum are
158 disabled.
159
160 datacow, nodatacow
161 (default: on)
162
Enable data copy-on-write for newly created files. Nodatacow
implies nodatasum, and disables compression. All files created
under nodatacow also get the NOCOW file attribute set (see
chattr(1)).

Note
If nodatacow or nodatasum are enabled, compression is disabled.

Updates in-place improve performance for workloads that do frequent
overwrites, at the cost of potential partial writes in case the
write is interrupted (system crash, device failure).
173
174 datasum, nodatasum
175 (default: on)
176
177 Enable data checksumming for newly created files. Datasum implies
178 datacow, ie. the normal mode of operation. All files created under
179 nodatasum inherit the "no checksums" property, however there’s no
180 corresponding file attribute (see chattr(1)).
181
Note
If nodatacow or nodatasum are enabled, compression is disabled.

There is a slight performance gain when checksums are turned off, as
the corresponding metadata blocks holding the checksums do not need
to be updated. The cost of checksumming the blocks in memory is much
lower than the cost of the IO, and modern CPUs feature hardware
support for the checksumming algorithm.
189
190 degraded
191 (default: off)
192
Allow mounts with fewer devices than the RAID profile constraints
require. A read-write mount (or remount) may fail when there are
too many devices missing, for example if a stripe member is
completely missing from RAID0.

Since 4.14, the constraint checks have been improved and are
verified on the chunk level, not at the device level. This allows
degraded mounts of filesystems with mixed RAID profiles for data
and metadata, even if the device number constraints would not be
satisfied for some of the profiles.
203
204 Example: metadata — raid1, data — single, devices — /dev/sda,
205 /dev/sdb
206
207 Suppose the data are completely stored on sda, then missing sdb
208 will not prevent the mount, even if 1 missing device would normally
209 prevent (any) single profile to mount. In case some of the data
210 chunks are stored on sdb, then the constraint of single/data is not
211 satisfied and the filesystem cannot be mounted.
212
213 device=devicepath
214 Specify a path to a device that will be scanned for BTRFS
215 filesystem during mount. This is usually done automatically by a
216 device manager (like udev) or using the btrfs device scan command
217 (eg. run from the initial ramdisk). In cases where this is not
218 possible the device mount option can help.
219
220 Note
221 booting eg. a RAID1 system may fail even if all filesystem’s
222 device paths are provided as the actual device nodes may not be
223 discovered by the system at that point.
224
225 discard, nodiscard
226 (default: off)
227
Enable discarding of freed file blocks. This is useful for SSD
devices, thinly provisioned LUNs, or virtual machine images;
however, every storage layer must support discard for it to work.
If the backing device does not support asynchronous queued TRIM,
then this operation can severely degrade performance, because a
synchronous TRIM operation will be attempted instead. Queued TRIM
requires chipsets and devices newer than SATA revision 3.1.
235
236 If it is not necessary to immediately discard freed blocks, then
237 the fstrim tool can be used to discard all free blocks in a batch.
238 Scheduling a TRIM during a period of low system activity will
239 prevent latent interference with the performance of other
240 operations. Also, a device may ignore the TRIM command if the range
241 is too small, so running a batch discard has a greater probability
242 of actually discarding the blocks.
243
244 enospc_debug, noenospc_debug
245 (default: off)
246
247 Enable verbose output for some ENOSPC conditions. It’s safe to use
248 but can be noisy if the system reaches near-full state.
249
250 fatal_errors=action
251 (since: 3.4, default: bug)
252
253 Action to take when encountering a fatal error.
254
255 bug
256 BUG() on a fatal error, the system will stay in the crashed
257 state and may be still partially usable, but reboot is required
258 for full operation
259
260 panic
261 panic() on a fatal error, depending on other system
262 configuration, this may be followed by a reboot. Please refer
263 to the documentation of kernel boot parameters, eg. panic,
264 oops or crashkernel.
265
266 flushoncommit, noflushoncommit
267 (default: off)
268
269 This option forces any data dirtied by a write in a prior
270 transaction to commit as part of the current commit, effectively a
271 full filesystem sync.
272
273 This makes the committed state a fully consistent view of the file
274 system from the application’s perspective (i.e. it includes all
275 completed file system operations). This was previously the behavior
276 only when a snapshot was created.
277
278 When off, the filesystem is consistent but buffered writes may last
279 more than one transaction commit.
280
281 fragment=type
282 (depends on compile-time option BTRFS_DEBUG, since: 4.4, default:
283 off)
284
285 A debugging helper to intentionally fragment given type of block
286 groups. The type can be data, metadata or all. This mount option
287 should not be used outside of debugging environments and is not
288 recognized if the kernel config option BTRFS_DEBUG is not enabled.
289
290 inode_cache, noinode_cache
291 (since: 3.0, default: off)
292
Enable free inode number caching. Not recommended to use unless
files on your filesystem get assigned inode numbers that are
approaching 2^64. Normally, new files in each subvolume get inode
numbers assigned incrementally (plus one from the last time) and
the numbers are not reused. The mount option turns on caching of
the existing inode numbers and reuse of inode numbers of deleted
files.
299
300 This option may slow down your system at first run, or after
301 mounting without the option.
302
Note
Defaults to off due to a potential overflow problem when the
free space checksums don’t fit inside a single page.

Don’t use this option unless you really need it. The inode number
limit on 64bit systems is 2^64, which is practically enough for the
whole filesystem lifetime. Due to the implementation of the Linux
VFS layer, the inode numbers on 32bit systems are only 32 bits wide.
This lowers the limit significantly and makes it possible to reach
it. In such case, this mount option will help. Alternatively, files
with high inode numbers can be copied to a new subvolume which will
effectively start the inode numbers from the beginning again.
314
315 logreplay, nologreplay
316 (default: on, even read-only)
317
318 Enable/disable log replay at mount time. See also treelog. Note
319 that nologreplay is the same as norecovery.
320
321 Warning
322 currently, the tree log is replayed even with a read-only
323 mount! To disable that behaviour, mount also with nologreplay.
324
325 max_inline=bytes
326 (default: min(2048, page size) )
327
Specify the maximum amount of data that can be inlined in a
metadata B-tree leaf. The value is specified in bytes, optionally
with a K suffix (case insensitive). In practice, this value is
limited by the filesystem block size (named sectorsize at mkfs
time), and memory page size of the system. In case of the
sectorsize limit, there’s some space unavailable due to leaf
headers. For example, with a 4k sectorsize, the maximum size of
inline data is about 3900 bytes.
336
337 Inlining can be completely turned off by specifying 0. This will
338 increase data block slack if file sizes are much smaller than block
339 size but will reduce metadata consumption in return.
340
341 Note
342 the default value has changed to 2048 in kernel 4.6.
343
344 metadata_ratio=value
345 (default: 0, internal logic)
346
Specifies that 1 metadata chunk should be allocated after every
value data chunks. The default behaviour depends on internal logic:
the filesystem attempts to keep some percentage of metadata space
unused, which is not always possible if there’s not enough space
left for chunk allocation. The option could be useful to override
the internal logic in favor of the metadata allocation if the
expected workload is supposed to be metadata intense (snapshots,
reflinks, xattrs, inlined files).
355
356 norecovery
357 (since: 4.5, default: off)
358
359 Do not attempt any data recovery at mount time. This will disable
360 logreplay and avoids other write operations. Note that this option
361 is the same as nologreplay.
362
363 Note
364 The opposite option recovery used to have different meaning but
365 was changed for consistency with other filesystems, where
366 norecovery is used for skipping log replay. BTRFS does the same
367 and in general will try to avoid any write operations.
368
369 rescan_uuid_tree
370 (since: 3.12, default: off)
371
372 Force check and rebuild procedure of the UUID tree. This should not
373 normally be needed.
374
375 skip_balance
376 (since: 3.3, default: off)
377
378 Skip automatic resume of an interrupted balance operation. The
379 operation can later be resumed with btrfs balance resume, or the
380 paused state can be removed with btrfs balance cancel. The default
381 behaviour is to resume an interrupted balance immediately after a
382 volume is mounted.
383
384 space_cache, space_cache=version, nospace_cache
385 (nospace_cache since: 3.2, space_cache=v1 and space_cache=v2 since
386 4.5, default: space_cache=v1)
387
388 Options to control the free space cache. The free space cache
389 greatly improves performance when reading block group free space
390 into memory. However, managing the space cache consumes some
391 resources, including a small amount of disk space.
392
393 There are two implementations of the free space cache. The original
394 one, referred to as v1, is the safe default. The v1 space cache can
395 be disabled at mount time with nospace_cache without clearing.
396
397 On very large filesystems (many terabytes) and certain workloads,
398 the performance of the v1 space cache may degrade drastically. The
399 v2 implementation, which adds a new B-tree called the free space
400 tree, addresses this issue. Once enabled, the v2 space cache will
401 always be used and cannot be disabled unless it is cleared. Use
402 clear_cache,space_cache=v1 or clear_cache,nospace_cache to do so.
403 If v2 is enabled, kernels without v2 support will only be able to
404 mount the filesystem in read-only mode. The btrfs(8) command
405 currently only has read-only support for v2. A read-write command
406 may be run on a v2 filesystem by clearing the cache, running the
407 command, and then remounting with space_cache=v2.
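
The clear, run, remount sequence described above can be sketched as a
small wrapper. This is an illustration under assumptions, not a
documented procedure: with_v2_cleared is a hypothetical name, the
umount step assumes the command has to run on an unmounted
filesystem, and the device, mount point and command are supplied by
the caller.

```shell
# with_v2_cleared DEV MNT CMD [ARGS...]
# Hypothetical wrapper. Set DRY_RUN=1 to print the steps instead of
# running them (running them for real needs root).
with_v2_cleared() {
    dev=$1 mnt=$2
    shift 2
    run=${DRY_RUN:+echo}
    # 1. mount once with the v2 cache cleared and no cache enabled
    $run mount -o clear_cache,nospace_cache "$dev" "$mnt" &&
    # 2. assumption: the command needs the filesystem unmounted
    $run umount "$mnt" &&
    # 3. run the caller's read-write command
    $run "$@" &&
    # 4. mount again and let the kernel rebuild the free space tree
    $run mount -o space_cache=v2 "$dev" "$mnt"
}
```

Running it with DRY_RUN=1 first shows the exact commands without
touching the filesystem.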
408
409 If a version is not explicitly specified, the default
410 implementation will be chosen, which is v1.
411
412 ssd, ssd_spread, nossd, nossd_spread
413 (default: SSD autodetected)
414
415 Options to control SSD allocation schemes. By default, BTRFS will
416 enable or disable SSD optimizations depending on status of a device
with respect to rotational or non-rotational type. This is
determined by the contents of /sys/block/DEV/queue/rotational. If
it is 0, the ssd option is turned on. The option nossd will disable
420 the autodetection.
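
The same sysfs attribute can be read from userspace to predict what
the autodetection would pick. A minimal sketch: would_autodetect_ssd
is a hypothetical helper that looks at one device only, so it is a
simplification for multi-device filesystems.

```shell
# would_autodetect_ssd FILE
# FILE is a queue/rotational attribute, for example
# /sys/block/sda/queue/rotational (sda is only a placeholder name).
would_autodetect_ssd() {
    [ -r "$1" ] && [ "$(cat "$1")" = 0 ]
}

# Report every block device that exposes the attribute.
for f in /sys/block/*/queue/rotational; do
    [ -e "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    if would_autodetect_ssd "$f"; then
        echo "$dev: non-rotational, ssd would be turned on"
    else
        echo "$dev: rotational, ssd would stay off"
    fi
done
```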
421
422 The optimizations make use of the absence of the seek penalty
423 that’s inherent for the rotational devices. The blocks can be
424 typically written faster and are not offloaded to separate threads.
425
426 Note
427 Since 4.14, the block layout optimizations have been dropped.
428 This used to help with first generations of SSD devices. Their
429 FTL (flash translation layer) was not effective and the
430 optimization was supposed to improve the wear by better
431 aligning blocks. This is no longer true with modern SSD devices
432 and the optimization had no real benefit. Furthermore it caused
433 increased fragmentation. The layout tuning has been kept intact
for the option ssd_spread.

The ssd_spread mount option attempts to allocate into bigger and
436 aligned chunks of unused space, and may perform better on low-end
437 SSDs. ssd_spread implies ssd, enabling all other SSD heuristics as
438 well. The option nossd will disable all SSD options while
439 nossd_spread only disables ssd_spread.
440
441 subvol=path
442 Mount subvolume from path rather than the toplevel subvolume. The
443 path is always treated as relative to the toplevel subvolume. This
444 mount option overrides the default subvolume set for the given
445 filesystem.
446
447 subvolid=subvolid
Mount subvolume specified by a subvolid number rather than the
toplevel subvolume. You can use btrfs subvolume list or btrfs
subvolume show to see subvolume ID numbers. This mount option
overrides the default subvolume set for the given filesystem.
452
453 Note
454 if both subvolid and subvol are specified, they must point at
455 the same subvolume, otherwise the mount will fail.
456
457 thread_pool=number
458 (default: min(NRCPUS + 2, 8) )
459
460 The number of worker threads to start. NRCPUS is number of on-line
461 CPUs detected at the time of mount. Small number leads to less
462 parallelism in processing data and metadata, higher numbers could
463 lead to a performance hit due to increased locking contention,
464 process scheduling, cache-line bouncing or costly data transfers
465 between local CPU memories.
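
The default follows directly from the formula. The sketch below
computes it in userspace; default_thread_pool is a hypothetical
helper, and nproc only approximates the online CPU count that the
kernel samples at mount time.

```shell
# default_thread_pool NRCPUS
# Prints min(NRCPUS + 2, 8), the default given above.
default_thread_pool() {
    n=$(( $1 + 2 ))
    if [ "$n" -gt 8 ]; then
        n=8
    fi
    echo "$n"
}

# Value for the machine this runs on, if nproc is available.
if command -v nproc >/dev/null 2>&1; then
    default_thread_pool "$(nproc)"
fi
```

Any machine with 6 or more online CPUs ends up at the cap of 8, so
thread_pool only needs to be set explicitly to go above that or to
reduce parallelism.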
466
467 treelog, notreelog
468 (default: on)
469
470 Enable the tree logging used for fsync and O_SYNC writes. The tree
471 log stores changes without the need of a full filesystem sync. The
472 log operations are flushed at sync and transaction commit. If the
473 system crashes between two such syncs, the pending tree log
474 operations are replayed during mount.
475
476 Warning
477 currently, the tree log is replayed even with a read-only
478 mount! To disable that behaviour, also mount with nologreplay.

The tree log could contain new files/directories; these would not
exist on a mounted filesystem if the log is not replayed.
481
482 usebackuproot, nousebackuproot
483 (since: 4.6, default: off)
484
485 Enable autorecovery attempts if a bad tree root is found at mount
486 time. Currently this scans a backup list of several previous tree
487 roots and tries to use the first readable. This can be used with
488 read-only mounts as well.
489
490 Note
491 This option has replaced recovery.
492
493 user_subvol_rm_allowed
494 (default: off)
495
496 Allow subvolumes to be deleted by their respective owner.
497 Otherwise, only the root user can do that.
498
Note
Historically, any user could create a snapshot even if they were
not the owner of the source subvolume, and subvolume deletion was
restricted for that reason. Subvolume creation has since been
restricted, but this mount option is still required for deletion,
which is a usability issue. Since 4.18, the rmdir(2) syscall can
delete an empty subvolume just like an ordinary directory.
Whether this is possible can be detected at runtime, see
rmdir_subvol feature in FILESYSTEM FEATURES.
508
509 DEPRECATED MOUNT OPTIONS
510 List of mount options that have been removed, kept for backward
511 compatibility.
512
513 alloc_start=bytes
514 (default: 1M, minimum: 1M, deprecated since: 4.13)
515
516 Debugging option to force all block allocations above a certain
517 byte threshold on each block device. The value is specified in
518 bytes, optionally with a K, M, or G suffix (case insensitive).
519
520 recovery
521 (since: 3.2, default: off, deprecated since: 4.5)
522
523 Note
524 this option has been replaced by usebackuproot and should not
525 be used but will work on 4.5+ kernels.
526
527 subvolrootid=objectid
528 (irrelevant since: 3.2, formally deprecated since: 3.10)
529
530 A workaround option from times (pre 3.2) when it was not possible
531 to mount a subvolume that did not reside directly under the
532 toplevel subvolume.
533
534 NOTES ON GENERIC MOUNT OPTIONS
535 Some of the general mount options from mount(8) that affect BTRFS and
536 are worth mentioning.
537
538 noatime
539 under read intensive work-loads, specifying noatime significantly
540 improves performance because no new access time information needs
541 to be written. Without this option, the default is relatime, which
542 only reduces the number of inode atime updates in comparison to the
543 traditional strictatime. The worst case for atime updates under
544 relatime occurs when many files are read whose atime is older than
545 24 h and which are freshly snapshotted. In that case the atime is
546 updated and COW happens - for each file - in bulk. See also
547 https://lwn.net/Articles/499293/ - Atime and btrfs: a bad
548 combination? (LWN, 2012-05-31).
549
Note that noatime may break applications that rely on atime updates
like the venerable Mutt (unless you use maildir mailboxes).

FILESYSTEM FEATURES
The basic set of filesystem features gets extended over time.
Backward compatibility is maintained and the features are optional:
they need to be explicitly asked for, so accidental use will not
create incompatibilities.
558
559 There are several classes and the respective tools to manage the
560 features:
561
562 at mkfs time only
563 This is namely for core structures, like the b-tree nodesize or
564 checksum algorithm, see mkfs.btrfs(8) for more details.
565
566 after mkfs, on an unmounted filesystem
Features that may optimize internal structures or add new
structures to support new functionality, see btrfstune(8). The
command btrfs inspect-internal dump-super device will dump a
superblock; you can map the value of incompat_flags to the features
listed below.
572
573 after mkfs, on a mounted filesystem
574 The features of a filesystem (with a given UUID) are listed in
575 /sys/fs/btrfs/UUID/features/, one file per feature. The status is
576 stored inside the file. The value 1 is for enabled and active,
577 while 0 means the feature was enabled at mount time but turned off
578 afterwards.
579
Whether a particular feature can be turned on on a mounted
filesystem can be found in the directory /sys/fs/btrfs/features/,
one file per feature. The value 1 means the feature can be enabled.
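
Both sysfs directories use the same one-file-per-feature layout, so
one loop can read either of them. A minimal sketch; list_features is
a hypothetical helper and the directory is passed in by the caller.

```shell
# list_features DIR
# Prints "name=value" for every regular file in DIR, for example
# /sys/fs/btrfs/features or /sys/fs/btrfs/UUID/features.
list_features() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Features known to the loaded btrfs module, if there is one.
if [ -d /sys/fs/btrfs/features ]; then
    list_features /sys/fs/btrfs/features
fi
```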
583
584 List of features (see also mkfs.btrfs(8) section FILESYSTEM FEATURES):
585
586 big_metadata
587 (since: 3.4)
588
589 the filesystem uses nodesize for metadata blocks, this can be
590 bigger than the page size
591
592 compress_lzo
593 (since: 2.6.38)
594
595 the lzo compression has been used on the filesystem, either as a
596 mount option or via btrfs filesystem defrag.
597
598 compress_zstd
599 (since: 4.14)
600
601 the zstd compression has been used on the filesystem, either as a
602 mount option or via btrfs filesystem defrag.
603
604 default_subvol
605 (since: 2.6.34)
606
607 the default subvolume has been set on the filesystem
608
609 extended_iref
610 (since: 3.7)
611
612 increased hardlink limit per file in a directory to 65536, older
613 kernels supported a varying number of hardlinks depending on the
614 sum of all file name sizes that can be stored into one metadata
615 block
616
617 metadata_uuid
618 (since: 5.0)
619
620 the main filesystem UUID is the metadata_uuid, which stores the new
621 UUID only in the superblock while all metadata blocks still have
622 the UUID set at mkfs time, see btrfstune(8) for more
623
624 mixed_backref
625 (since: 2.6.31)
626
627 the last major disk format change, improved backreferences, now
628 default
629
630 mixed_groups
631 (since: 2.6.37)
632
633 mixed data and metadata block groups, ie. the data and metadata are
634 not separated and occupy the same block groups, this mode is
635 suitable for small volumes as there are no constraints how the
636 remaining space should be used (compared to the split mode, where
637 empty metadata space cannot be used for data and vice versa)
638
639 on the other hand, the final layout is quite unpredictable and
640 possibly highly fragmented, which means worse performance
641
642 no_holes
643 (since: 3.14)
644
645 improved representation of file extents where holes are not
646 explicitly stored as an extent, saves a few percent of metadata if
647 sparse files are used
648
649 raid56
650 (since: 3.9)
651
652 the filesystem contains or contained a raid56 profile of block
653 groups
654
655 rmdir_subvol
656 (since: 4.18)
657
658 indicate that rmdir(2) syscall can delete an empty subvolume just
659 like an ordinary directory. Note that this feature only depends on
660 the kernel version.
661
662 skinny_metadata
663 (since: 3.10)
664
665 reduced-size metadata for extent references, saves a few percent of
666 metadata
667
668 SWAPFILE SUPPORT
The swapfile is supported since kernel 5.0. Use swapon(8) to activate
the swapfile. There are some limitations of the implementation in btrfs
and the Linux swap subsystem:
672
673 · filesystem - must be only single device
674
675 · swapfile - the containing subvolume cannot be snapshotted
676
677 · swapfile - must be preallocated
678
679 · swapfile - must be nodatacow (ie. also nodatasum)
680
681 · swapfile - must not be compressed
682
683 The limitations come namely from the COW-based design and mapping layer
684 of blocks that allows the advanced features like relocation and
685 multi-device filesystems. However, the swap subsystem expects simpler
686 mapping and no background changes of the file blocks once they’ve been
687 attached to swap.
688
689 With active swapfiles, the following whole-filesystem operations will
690 skip swapfile extents or may fail:
691
692 · balance - block groups with swapfile extents are skipped and
693 reported, the rest will be processed normally
694
695 · resize grow - unaffected
696
697 · resize shrink - works as long as the extents are outside of the
698 shrunk range
699
700 · device add - a new device does not interfere with existing swapfile
701 and this operation will work, though no new swapfile can be
702 activated afterwards
703
704 · device delete - if the device has been added as above, it can be
705 also deleted
706
707 · device replace - ditto
708
709 When there are no active swapfiles and a whole-filesystem exclusive
710 operation is running (ie. balance, device delete, shrink), the
711 swapfiles cannot be temporarily activated. The operation must finish
712 first.
713
    # truncate -s 0 swapfile
    # chattr +C swapfile
    # fallocate -l 2G swapfile
    # chmod 0600 swapfile
    # mkswap swapfile
    # swapon swapfile

CHECKSUM ALGORITHMS
722 There are several checksum algorithms supported. The default and
723 backward compatible is crc32c. Since kernel 5.5 there are three more
724 with different characteristics and trade-offs regarding speed and
725 strength. The following list may help you to decide which one to
726 select.
727
728 CRC32C (32bit digest)
729 default, best backward compatibility, very fast, modern CPUs have
730 instruction-level support, not collision-resistant but still good
731 error detection capabilities
732
733 XXHASH (64bit digest)
734 can be used as CRC32C successor, very fast, optimized for modern
735 CPUs utilizing instruction pipelining, good collision resistance
736 and error detection
737
738 SHA256 (256bit digest)
739 a cryptographic-strength hash, relatively slow but with possible
740 CPU instruction acceleration or specialized hardware cards, FIPS
741 certified and in wide use
742
743 BLAKE2b (256bit digest)
744 a cryptographic-strength hash, relatively fast with possible CPU
745 acceleration using SIMD extensions, not standardized but based on
746 BLAKE which was a SHA3 finalist, in wide use, the algorithm used is
747 BLAKE2b-256 that’s optimized for 64bit platforms
748
749 The digest size affects overall size of data block checksums stored in
750 the filesystem. The metadata blocks have a fixed area up to 256bits (32
751 bytes), so there’s no increase. Each data block has a separate checksum
752 stored, with additional overhead of the b-tree leaves.
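
The effect of the digest size can be estimated with simple
arithmetic: one digest per data block. The sketch below assumes a
4 KiB data block unless told otherwise and ignores the b-tree leaf
overhead mentioned above; csum_bytes is a hypothetical helper.

```shell
# csum_bytes DATA_BYTES DIGEST_BYTES [BLOCK_BYTES]
# Prints the raw size of the data checksums: one digest for every
# started block. BLOCK_BYTES defaults to 4096 (an assumption).
csum_bytes() {
    block=${3:-4096}
    echo $(( ($1 + block - 1) / block * $2 ))
}

tib=$(( 1024 * 1024 * 1024 * 1024 ))
echo "crc32c (4 byte digest):   $(csum_bytes "$tib" 4) bytes"
echo "xxhash (8 byte digest):   $(csum_bytes "$tib" 8) bytes"
echo "sha256, blake2b (32 byte): $(csum_bytes "$tib" 32) bytes"
```

Under these assumptions, 1 TiB of data needs 1 GiB of raw checksums
with CRC32C and 8 GiB with either of the 256bit digests.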
753
Approximate relative performance of the algorithms, measured against
CRC32C using reference software implementations on a 3.5GHz Intel CPU:
756
┌─────────┬─────────────┬───────┐
│ Digest  │ Cycles/4KiB │ Ratio │
├─────────┼─────────────┼───────┤
│ CRC32C  │        1700 │  1.00 │
│ XXHASH  │        2500 │  1.44 │
│ SHA256  │      105000 │    61 │
│ BLAKE2b │       22000 │    13 │
└─────────┴─────────────┴───────┘

FILESYSTEM LIMITS
775 maximum file name length
776 255
777
778 maximum symlink target length
779 depends on the nodesize value, for 4k it’s 3949 bytes, for larger
780 nodesize it’s 4095 due to the system limit PATH_MAX
781
782 The symlink target may not be a valid path, ie. the path name
783 components can exceed the limits (NAME_MAX), there’s no content
784 validation at symlink(3) creation.
785
786 maximum number of inodes
787 2^64 but depends on the available metadata space as the inodes are
788 created dynamically
789
790 inode numbers
791 minimum number: 256 (for subvolumes), regular files and
792 directories: 257
793
794 maximum file length
795 inherent limit of btrfs is 2^64 (16 EiB) but the linux VFS limit is
796 2^63 (8 EiB)
797
798 maximum number of subvolumes
the subvolume ids can go up to 2^64 but the number of actual
subvolumes depends on the available metadata space; the space
consumed by all subvolume metadata, which includes bookkeeping of
shared extents, can be large (MiB, GiB)
803
804 maximum number of hardlinks of a file in a directory
805 65536 when the extref feature is turned on during mkfs (default),
806 roughly 100 otherwise

BOOTLOADER SUPPORT
809 GRUB2 (https://www.gnu.org/software/grub) has the most advanced support
810 of booting from BTRFS with respect to features.
811
812 EXTLINUX (from the https://syslinux.org project) can boot but does not
813 support all features. Please check the upstream documentation before
814 you use it.

FILE ATTRIBUTES
817 The btrfs filesystem supports setting the following file attributes
818 using the chattr(1) utility:
819
820 a
821 append only, new writes are always written at the end of the file
822
823 A
824 no atime updates
825
826 c
827 compress data, all data written after this attribute is set will be
828 compressed. Please note that compression is also affected by the
829 mount options or the parent directory attributes.
830
831 When set on a directory, all newly created files will inherit this
832 attribute.
833
834 C
835 no copy-on-write, file modifications are done in-place
836
837 When set on a directory, all newly created files will inherit this
838 attribute.
839
840 Note
841 due to implementation limitations, this flag can be set/unset
842 only on empty files.
843
844 d
845 no dump, makes sense with 3rd party tools like dump(8), on BTRFS
846 the attribute can be set/unset but no other special handling is
847 done
848
849 D
850 synchronous directory updates, for more details search open(2) for
851 O_SYNC and O_DSYNC
852
853 i
854 immutable, no file data and metadata changes allowed even to the
855 root user as long as this attribute is set (obviously the exception
856 is unsetting the attribute)
857
858 S
859 synchronous updates, for more details search open(2) for O_SYNC and
860 O_DSYNC
861
862 X
863 no compression, permanently turn off compression on the given file.
864 Any compression mount options will not affect this file.
865
866 When set on a directory, all newly created files will inherit this
867 attribute.
868
869 No other attributes are supported. For the complete list please refer
870 to the chattr(1) manual page.

CONTROL DEVICE
873 There’s a character special device /dev/btrfs-control with major and
874 minor numbers 10 and 234 (the device can be found under the misc
875 category).
876
    $ ls -l /dev/btrfs-control
    crw------- 1 root root 10, 234 Jan 1 12:00 /dev/btrfs-control
879
The device accepts some ioctl calls that can perform the following
actions on the filesystem module:
882
883 · scan devices for btrfs filesystem (ie. to let multi-device
884 filesystems mount automatically) and register them with the kernel
885 module
886
887 · similar to scan, but also wait until the device scanning process is
888 finished for a given filesystem
889
890 · get the supported features (can be also found under
891 /sys/fs/btrfs/features)
892
893 The device is usually created by a system device node manager (eg.
894 udev), but can be created manually:
895
    # mknod --mode=600 /dev/btrfs-control c 10 234
897
898 The control device is not strictly required but the device scanning
899 will not work and a workaround would need to be used to mount a
900 multi-device filesystem. The mount option device can trigger the device
901 scanning during mount.
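
Whether the node exists with the expected numbers can be checked
from a script. This sketch assumes a stat that supports the -c
format option (as GNU coreutils does); is_btrfs_control is a
hypothetical helper.

```shell
# is_btrfs_control PATH
# True if PATH is a character device with major 10 and minor 234.
# stat prints both numbers in hex: 10 is "a" and 234 is "ea".
is_btrfs_control() {
    [ -c "$1" ] && [ "$(stat -c '%t:%T' "$1")" = "a:ea" ]
}

if is_btrfs_control /dev/btrfs-control; then
    echo "control device present"
else
    echo "control device missing or has unexpected device numbers"
fi
```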

SEE ALSO
904 acl(5), btrfs(8), chattr(1), fstrim(8), ioctl(2), mkfs.btrfs(8),
905 mount(8), swapon(8)
906
907
908
Btrfs v5.4 12/06/2019 BTRFS-MAN5(5)