BTRFS-DEVICE(8)                  Btrfs Manual                  BTRFS-DEVICE(8)

NAME
    btrfs-device - manage devices of btrfs filesystems

SYNOPSIS
    btrfs device <subcommand> <args>

DESCRIPTION
    The btrfs device command group is used to manage devices of btrfs
    filesystems.
    A btrfs filesystem can be created on top of single or multiple block
    devices. Data and metadata are organized in allocation profiles with
    various redundancy policies. There's some similarity with traditional
    RAID levels, but this could be confusing to users familiar with the
    traditional meaning. Due to the similarity, the RAID terminology is
    widely used in the documentation. See mkfs.btrfs(8) for more details
    and the exact profile capabilities and constraints.

    The device management works on a mounted filesystem. Devices can be
    added, removed or replaced by commands provided by btrfs device and
    btrfs replace.

    The profiles can also be changed, provided there's enough workspace
    to do the conversion, using the btrfs balance command, namely the
    convert filter.

    Type
        The block group profile type is the main distinction of the
        information stored on the block device. User data are called
        Data; the internal data structures managed by the filesystem are
        Metadata and System.

    Profile
        A profile describes an allocation policy based on the
        redundancy/replication constraints in connection with the number
        of devices. The profile applies to data and metadata block
        groups separately. E.g. single, RAID1.

    RAID level
        Where applicable, the level refers to a profile that matches
        constraints of the standard RAID levels. At the moment the
        supported ones are: RAID0, RAID1, RAID10, RAID5 and RAID6.

    See the section TYPICAL USECASES for some examples.

SUBCOMMAND
    add [-Kf] <device> [<device>...] <path>
        Add device(s) to the filesystem identified by <path>.

        If applicable, a whole device discard (TRIM) operation is
        performed prior to adding the device. A device with an existing
        filesystem detected by blkid(8) will prevent device addition and
        has to be forced. Alternatively the filesystem can be wiped from
        the device using e.g. the wipefs(8) tool.
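
        For example, assuming the new device /dev/sdb (an illustrative
        name) carries a stale filesystem signature, the addition can be
        forced, or the signature wiped beforehand:

            $ btrfs device add -f /dev/sdb /mnt

        or

            $ wipefs -a /dev/sdb
            $ btrfs device add /dev/sdb /mnt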

        The operation is instant and does not affect existing data. It
        merely adds the device to the filesystem structures and creates
        some block group headers.

        Options

        -K|--nodiscard
            do not perform discard (TRIM) by default

        -f|--force
            force overwrite of existing filesystem on the given disk(s)

        --enqueue
            wait if there's another exclusive operation running,
            otherwise continue

    remove [options] <device>|<devid> [<device>|<devid>...] <path>
        Remove device(s) from a filesystem identified by <path>.

        Device removal must satisfy the profile constraints, otherwise
        the command fails. The filesystem must be converted to
        profile(s) that would allow the removal. This can typically
        happen when going down from 2 devices to 1 and using the RAID1
        profile. See the TYPICAL USECASES section below.

        The operation can take a long time as it needs to move all data
        from the device.

        It is possible to delete the device that was used to mount the
        filesystem. The device entry in the mount table will be replaced
        by another device name with the lowest device id.

        If the filesystem is mounted in degraded mode (-o degraded), the
        special term missing can be used for <device>. In that case, the
        first device that is described by the filesystem metadata but
        not present at mount time will be removed.

        Note
            In most cases, there is only one missing device in degraded
            mode, otherwise the mount fails. If there are two or more
            devices missing (e.g. possible in RAID6), you need to
            specify missing as many times as the number of missing
            devices to remove all of them.
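
        For example, one possible sequence for replacing a dead disk in
        a two-device RAID1 filesystem, assuming /dev/sdb is the
        surviving member and /dev/sdc a new blank device (names are
        illustrative):

            $ mount -o degraded /dev/sdb /mnt
            $ btrfs device add /dev/sdc /mnt
            $ btrfs device remove missing /mnt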
        Options

        --enqueue
            wait if there's another exclusive operation running,
            otherwise continue

    delete <device>|<devid> [<device>|<devid>...] <path>
        Alias of remove kept for backward compatibility.

    ready <device>
        Wait until all devices of a multiple-device filesystem are
        scanned and registered within the kernel module. This is to
        provide a way for automatic filesystem mounting tools to wait
        before the mount can start. The device scan is only one of the
        preconditions and the mount can fail for other reasons. Normal
        users usually do not need this command and may safely ignore it.
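
        A minimal sketch of such use, assuming /dev/sdb (an illustrative
        name) is a member of a multiple-device filesystem:

            $ btrfs device ready /dev/sdb && mount /dev/sdb /mnt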

    scan [options] [<device> [<device>...]]
        Scan devices for a btrfs filesystem and register them with the
        kernel module. This allows mounting a multiple-device filesystem
        by specifying just one device from the whole group.

        If no devices are passed, all block devices that blkid reports
        to contain btrfs are scanned.

        The options --all-devices or -d can be used as a fallback in
        case blkid is not available. If used, the behavior is the same
        as if no devices are passed.

        The command can be run repeatedly. Devices that have already
        been registered remain as such. Reloading the kernel module will
        drop this information. There's an alternative way of mounting a
        multiple-device filesystem without the need for prior scanning.
        See the mount option device.
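
        For example, a two-device filesystem could be mounted without a
        prior scan by naming the second member explicitly (device names
        are illustrative):

            $ mount -o device=/dev/sdb /dev/sda /mnt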

        Options

        -d|--all-devices
            Enumerate and register all devices, use as a fallback in
            case blkid is not available.

        -u|--forget
            Unregister a given device, or all stale devices if no path
            is given; the device must be unmounted, otherwise it's an
            error.
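
        For example, to register all detected btrfs devices and, later,
        to drop all stale (unmounted) registrations:

            $ btrfs device scan
            $ btrfs device scan --forget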

    stats [options] <path>|<device>
        Read and print the device IO error statistics for all devices
        of the given filesystem identified by <path> or for a single
        <device>. The filesystem must be mounted. See section DEVICE
        STATS for more information about the reported statistics and
        their meaning.

        Options

        -z|--reset
            Print the stats and reset the values to zero afterwards.

        -c|--check
            Check if the stats are all zeros and return 0 if so. Set
            bit 6 of the return code if any of the statistics is
            non-zero. The error value is 65 if reading stats from at
            least one device failed, otherwise it's 64.
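
        For example, to print the counters of all devices of the
        filesystem mounted at /mnt (an illustrative path) and reset
        them afterwards:

            $ btrfs device stats -z /mnt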

    usage [options] <path> [<path>...]
        Show detailed information about internal allocations on devices.

        The level of detail can differ if the command is run under a
        regular or the root user (due to use of restricted ioctls). The
        first example below is for a normal user (warning included) and
        the next one with root on the same filesystem:

            WARNING: cannot read detailed chunk info, per-device usage will not be shown, run as root
            /dev/sdc1, ID: 1
               Device size:         931.51GiB
               Device slack:            0.00B
               Unallocated:         931.51GiB

            /dev/sdc1, ID: 1
               Device size:         931.51GiB
               Device slack:            0.00B
               Data,single:         641.00GiB
               Data,RAID0/3:          1.00GiB
               Metadata,single:      19.00GiB
               System,single:        32.00MiB
               Unallocated:         271.48GiB

        •   Device size — size of the device as seen by the filesystem
            (may differ from the actual device size)

        •   Device slack — portion of the device not used by the
            filesystem but still available in the physical space
            provided by the device, e.g. after a device shrink

        •   Data,single, Metadata,single, System,single — in general,
            list of block group type (Data, Metadata, System) and
            profile (single, RAID1, ...) allocated on the device

        •   Data,RAID0/3 — in particular, striped profiles
            RAID0/RAID10/RAID5/RAID6 with the number of devices on
            which the stripes are allocated; multiple occurrences of
            the same profile can appear in case a new device has been
            added and all new available stripes have been used for
            writes

        •   Unallocated — remaining space that the filesystem can still
            use for new block groups

        Options

        -b|--raw
            raw numbers in bytes, without the B suffix

        -h|--human-readable
            print human friendly numbers, base 1024, this is the
            default

        -H
            print human friendly numbers, base 1000

        --iec
            select the 1024 base for the following options, according
            to the IEC standard

        --si
            select the 1000 base for the following options, according
            to the SI standard

        -k|--kbytes
            show sizes in KiB, or kB with --si

        -m|--mbytes
            show sizes in MiB, or MB with --si

        -g|--gbytes
            show sizes in GiB, or GB with --si

        -t|--tbytes
            show sizes in TiB, or TB with --si

        If conflicting options are passed, the last one takes
        precedence.
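
        For example, to print sizes in GB, base 1000 (the path is
        illustrative; --si is given first so that the 1000 base applies
        to the following -g):

            $ btrfs device usage --si -g /mnt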

TYPICAL USECASES
    STARTING WITH A SINGLE-DEVICE FILESYSTEM
        Assume we've created a filesystem on a block device /dev/sda
        with profile single/single (data/metadata), the device size is
        50GiB and we've used the whole device for the filesystem. The
        mount point is /mnt.

        The amount of data stored is 16GiB, metadata have allocated
        2GiB.

    ADD NEW DEVICE
        We want to increase the total size of the filesystem and keep
        the profiles. The size of the new device /dev/sdb is 100GiB.

            $ btrfs device add /dev/sdb /mnt

        The amount of free data space increases by less than 100GiB,
        some space is allocated for metadata.
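
        The resulting allocation can be inspected, e.g. with the usage
        subcommand described above:

            $ btrfs device usage /mnt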

    CONVERT TO RAID1
        Now we want to increase the redundancy level of both data and
        metadata, but we'll do that in steps. Note that the device
        sizes are not equal and we'll use that to show the capabilities
        of split data/metadata and independent profiles.

        The constraint for RAID1 gives us at most 50GiB of usable space
        and exactly 2 copies will be stored on the devices.

        First we'll convert the metadata. As the metadata occupy less
        than 50GiB and there's enough workspace for the conversion
        process, we can do:

            $ btrfs balance start -mconvert=raid1 /mnt

        This operation can take a while, because all metadata have to
        be moved and all block pointers updated. Depending on the
        physical locations of the old and new blocks, the disk seeking
        is the key factor affecting performance.

        You'll note that the system block group has also been converted
        to RAID1; this normally happens as the system block group also
        holds metadata (the physical to logical mappings).

        What changed:

        •   available data space decreased by 3GiB, usable roughly
            (50 - 3) + (100 - 3) = 144 GiB

        •   metadata redundancy increased

        In other words, the unequal device sizes allow for combined
        space for data yet improved redundancy for metadata. If we
        decide to increase the redundancy of data as well, we're going
        to lose 50GiB of the second device for obvious reasons.

            $ btrfs balance start -dconvert=raid1 /mnt

        The balance process needs some workspace (i.e. free device
        space without any data or metadata block groups) so the command
        could fail if there's too much data or the block groups occupy
        the whole first device.

        The device size of /dev/sdb as seen by the filesystem remains
        unchanged, but the logical space from 50-100GiB will be unused.

    REMOVE DEVICE
        Device removal must satisfy the profile constraints, otherwise
        the command fails. For example:

            $ btrfs device remove /dev/sda /mnt
            ERROR: error removing device '/dev/sda': unable to go below two devices on raid1

        In order to remove a device, you need to convert the profile in
        this case:

            $ btrfs balance start -mconvert=dup -dconvert=single /mnt
            $ btrfs device remove /dev/sda /mnt

DEVICE STATS
    The device stats keep a persistent record of several error classes
    related to doing IO. The current values are printed at mount time
    and updated during filesystem lifetime or from a scrub run.

        $ btrfs device stats /dev/sda3
        [/dev/sda3].write_io_errs    0
        [/dev/sda3].read_io_errs     0
        [/dev/sda3].flush_io_errs    0
        [/dev/sda3].corruption_errs  0
        [/dev/sda3].generation_errs  0

    write_io_errs
        Failed writes to the block devices, means that the layers
        beneath the filesystem were not able to satisfy the write
        request.

    read_io_errs
        Read request analogy to write_io_errs.

    flush_io_errs
        Number of failed writes with the FLUSH flag set. The flushing
        is a method of forcing a particular order between write
        requests and is crucial for implementing crash consistency. In
        case of btrfs, all the metadata blocks must be permanently
        stored on the block device before the superblock is written.

    corruption_errs
        A block checksum mismatched or a corrupted metadata header was
        found.

    generation_errs
        The block generation does not match the expected value (e.g.
        stored in the parent node).

    Since kernel 5.14 the device stats are also available in textual
    form in /sys/fs/btrfs/FSID/devinfo/DEVID/error_stats.
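
    For example, for the device with devid 1, where FSID stands for the
    filesystem UUID:

        $ cat /sys/fs/btrfs/FSID/devinfo/1/error_stats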

EXIT STATUS
    btrfs device returns a zero exit status if it succeeds. Non-zero is
    returned in case of failure.

    If the -c option is used, btrfs device stats will add 64 to the
    exit status if any of the error counters is non-zero.
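
    For example, a minimal shell check built on that convention (the
    path is illustrative):

        $ btrfs device stats -c /mnt || echo "device errors detected"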

AVAILABILITY
    btrfs is part of btrfs-progs. Please refer to the btrfs wiki
    http://btrfs.wiki.kernel.org for further details.

SEE ALSO
    mkfs.btrfs(8), btrfs-replace(8), btrfs-balance(8)

Btrfs v5.15.1                    11/22/2021                 BTRFS-DEVICE(8)