BTRFS-BALANCE(8)                 Btrfs Manual                 BTRFS-BALANCE(8)


NAME
       btrfs-balance - balance block groups on a btrfs filesystem

SYNOPSIS
       btrfs balance <subcommand> <args>

DESCRIPTION
       The primary purpose of the balance feature is to spread block groups
       across all devices so they match constraints defined by the respective
       profiles. See mkfs.btrfs(8) section PROFILES for more details. The
       scope of the balancing process can be further tuned by use of filters
       that can select the block groups to process. Balance works only on a
       mounted filesystem. Extent sharing is preserved and reflinks are not
       broken. Files are not defragmented nor recompressed; file extents are
       preserved but their physical location on devices will change.

       The balance operation is cancellable by the user. The on-disk state of
       the filesystem is always consistent so an unexpected interruption (eg.
       system crash, reboot) does not corrupt the filesystem. The progress of
       the balance operation is temporarily stored as an internal state and
       will be resumed upon mount, unless the mount option skip_balance is
       specified.

       Warning
           running balance without filters will take a lot of time as it
           basically moves data/metadata from the whole filesystem and needs
           to update all block pointers.

       The filters can be used to perform the following actions:

       ·   convert block group profiles (filter convert)

       ·   make block group usage more compact (filter usage)

       ·   perform actions only on a given device (filters devid, drange)

       The filters can be applied to a combination of block group types
       (data, metadata, system). Note that changing only the system type
       needs the force option. Otherwise the system type is converted
       automatically whenever the metadata profile is converted.

       When metadata redundancy is reduced (eg. from RAID1 to single) the
       force option is also required and this is noted in the system log.

       Note
           the balance operation needs enough work space, ie. space that is
           completely unused in the filesystem, otherwise this may lead to
           ENOSPC reports. See the section ENOSPC for more details.

COMPATIBILITY
       Note
           The balance subcommand also exists under the btrfs filesystem
           namespace. This still works for backward compatibility but is
           deprecated and should not be used any more.

       Note
           A short syntax btrfs balance <path> works due to backward
           compatibility but is deprecated and should not be used any more.
           Use the btrfs balance start command instead.

PERFORMANCE IMPLICATIONS
       Balancing operations are very IO intensive and can also be quite CPU
       intensive, impacting other ongoing filesystem operations. Typically
       large amounts of data are copied from one location to another, with
       corresponding metadata updates.

       Depending upon the block group layout, it can also be seek heavy.
       Performance on rotational devices is noticeably worse compared to SSDs
       or fast arrays.

SUBCOMMAND
       cancel <path>
           cancels a running or paused balance, the command will block and
           wait until the block group currently being processed completes

           Since kernel 5.7 the response time of the cancellation is
           significantly improved, on older kernels it might take a long time
           until the currently processed chunk is completely finished.

       pause <path>
           pause a running balance operation, this will store the state of
           the balance progress and the used filters to the filesystem

       resume <path>
           resume an interrupted balance, the balance status must be stored
           on the filesystem from a previous run, eg. after it was paused or
           forcibly interrupted and mounted again with skip_balance
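
           For example, a balance paused earlier can be resumed from its
           stored state (the sequence below is only illustrative):

               # btrfs balance pause /path
               # btrfs balance resume /path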
92
93 start [options] <path>
94 start the balance operation according to the specified filters,
95 without any filters the data and metadata from the whole filesystem
96 are moved. The process runs in the foreground.
97
98 Note
99 the balance command without filters will basically move
100 everything in the filesystem to a new physical location on
101 devices (ie. it does not affect the logical properties of file
102 extents like offsets within files and extent sharing). The run
103 time is potentially very long, depending on the filesystem
104 size. To prevent starting a full balance by accident, the user
105 is warned and has a few seconds to cancel the operation before
106 it starts. The warning and delay can be skipped with
107 --full-balance option.
108 Please note that the filters must be written together with the -d,
109 -m and -s options, because they’re optional and bare -d and -m also
110 work and mean no filters.
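
           For example (the percentage is illustrative), -dusage=85 applies
           the usage filter to data block groups, while a bare -d selects
           all data block groups with no filter:

               # btrfs balance start -dusage=85 /path
               # btrfs balance start -d /path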

           Options

           -d[<filters>]
               act on data block groups, see the FILTERS section for details
               about filters

           -m[<filters>]
               act on metadata chunks, see the FILTERS section for details
               about filters

           -s[<filters>]
               act on system chunks (requires -f), see the FILTERS section
               for details about filters.

           -f
               force a reduction of metadata integrity, eg. when going from
               raid1 to single

           --background|--bg
               run the balance operation asynchronously in the background,
               uses fork(2) to start the process that calls the kernel ioctl

           --enqueue
               wait if there’s another exclusive operation running, otherwise
               continue

           -v
               (deprecated) alias for the global -v option

       status [-v] <path>
           Show status of a running or paused balance.

           Options

           -v
               (deprecated) alias for the global -v option

FILTERS
       From kernel 3.3 onwards, btrfs balance can limit its action to a
       subset of the whole filesystem, and can be used to change the
       replication configuration (e.g. moving data from single to RAID1).
       This functionality is accessed through the -d, -m or -s options to
       btrfs balance start, which filter on data, metadata and system blocks
       respectively.

       A filter has the following structure: type[=params][,type=...]

       The available types are:

       profiles=<profiles>
           Balances only block groups with the given profiles. Parameters
           are a list of profile names separated by "|" (pipe).
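
           For example, the following (illustrative) command balances only
           data block groups that currently use the raid0 or single profile;
           the pipe character usually needs quoting in the shell:

               # btrfs balance start -dprofiles='raid0|single' /path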

       usage=<percent>, usage=<range>
           Balances only block groups with usage under the given percentage.
           The value of 0 is allowed and will clean up completely unused
           block groups; this should not require any new work space to be
           allocated. You may want to use usage=0 in case balance is
           returning ENOSPC and your filesystem is not too full.

           The argument may be a single value or a range. The single value N
           means at most N percent used, equivalent to the ..N range syntax.
           Kernels prior to 4.4 accept only the single value format. The
           minimum range boundary is inclusive, maximum is exclusive.
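
           As an illustration (percentages chosen arbitrarily), the first
           command below processes data block groups used at most 50%, the
           second only those in the 10..50 range (the range form needs
           kernel 4.4 or newer):

               # btrfs balance start -dusage=50 /path
               # btrfs balance start -dusage=10..50 /path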

       devid=<id>
           Balances only block groups which have at least one chunk on the
           given device. To list devices with ids use btrfs filesystem show.
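
           For example, assuming btrfs filesystem show lists a device with
           devid 2, the following would balance only block groups that have
           a chunk on that device:

               # btrfs balance start -ddevid=2 /path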

       drange=<range>
           Balance only block groups which overlap with the given byte range
           on any device. Use in conjunction with devid to filter on a
           specific device. The parameter is a range specified as start..end.
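
           A sketch of combining both filters (the device id and byte range
           are made up for illustration); this restricts the balance to data
           chunks on device 1 that overlap its first 10GiB:

               # btrfs balance start -ddevid=1,drange=0..10737418240 /path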

       vrange=<range>
           Balance only block groups which overlap with the given byte range
           in the filesystem’s internal virtual address space. This is the
           address space that most reports from btrfs in the kernel log use.
           The parameter is a range specified as start..end.

       convert=<profile>
           Convert each selected block group to the profile named by the
           parameter.

           Note
               starting with kernel 4.5, the data chunks can be converted
               to/from the DUP profile on a single device.

           Note
               starting with kernel 4.6, all profiles can be converted
               to/from DUP on multi-device filesystems.
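
           A typical conversion, e.g. after adding a second device, converts
           both data and metadata (shown here only as an illustration):

               # btrfs balance start -dconvert=raid1 -mconvert=raid1 /path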

       limit=<number>, limit=<range>
           Process only the given number of chunks, after all other filters
           are applied. This can be used to specifically target a chunk in
           connection with other filters (drange, vrange) or just simply
           limit the amount of work done by a single balance run.

           The argument may be a single value or a range. The single value N
           means at most N chunks, equivalent to the ..N range syntax.
           Kernels prior to 4.4 accept only the single value format. The
           range minimum and maximum are inclusive.
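
           For example (values are arbitrary), each of the following runs
           relocates at most 10 of the block groups selected by the usage
           filter, spreading the work over several invocations:

               # btrfs balance start -dusage=50,limit=10 /path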

       stripes=<range>
           Balance only block groups which have the given number of stripes.
           The parameter is a range specified as start..end. Makes sense for
           block group profiles that utilize striping, ie. RAID0/10/5/6. The
           range minimum and maximum are inclusive.
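
           As an illustration, after adding devices to a striped filesystem
           the following would rewrite only block groups that are still
           striped over at most 2 devices (the range is an example):

               # btrfs balance start -dstripes=1..2 /path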

       soft
           Takes no parameters. Only has meaning when converting between
           profiles. When doing a convert from one profile to another and
           soft mode is on, chunks that already have the target profile are
           left untouched. This is useful e.g. when half of the filesystem
           was converted earlier but the conversion got cancelled.

           The soft mode switch is (like every other filter) per-type. For
           example, this means that we can convert metadata chunks the
           "hard" way while converting data chunks selectively with the soft
           switch.
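
           An illustrative command that finishes a previously interrupted
           conversion to RAID1, skipping data chunks that already have the
           target profile while converting metadata unconditionally:

               # btrfs balance start -dconvert=raid1,soft -mconvert=raid1 /path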

       Profile names, used in profiles and convert, are one of: raid0,
       raid1, raid10, raid5, raid6, dup, single. The mixed data/metadata
       profiles can be converted in the same way, but conversion between
       mixed and non-mixed profiles is not implemented. For the constraints
       of the profiles please refer to mkfs.btrfs(8), section PROFILES.

ENOSPC
       The way balance operates, it usually needs to temporarily create a
       new block group and move the old data there, before the old block
       group can be removed. For that it needs the work space, otherwise it
       fails for ENOSPC reasons. This is not the same ENOSPC as if the free
       space is exhausted. This refers to the space on the level of block
       groups, which are bigger parts of the filesystem that contain many
       file extents.

       The free work space can be calculated from the output of the btrfs
       filesystem show command:

           Label: 'BTRFS'  uuid: 8a9d72cd-ead3-469d-b371-9c7203276265
                   Total devices 2 FS bytes used 77.03GiB
                   devid    1 size 53.90GiB used 51.90GiB path /dev/sdc2
                   devid    2 size 53.90GiB used 51.90GiB path /dev/sde1

           size - used = free work space
           53.90GiB - 51.90GiB = 2.00GiB

       An example of a filter that does not require workspace is usage=0.
       This will scan through all unused block groups of a given type and
       will reclaim the space. After that it might be possible to run other
       filters.

   CONVERSIONS ON MULTIPLE DEVICES
       Conversion to profiles based on striping (RAID0, RAID5/6) requires
       work space on each device. An interrupted balance may leave partially
       filled block groups that consume the work space.

EXAMPLES
       A more comprehensive example when going from one to multiple devices,
       and back, can be found in section TYPICAL USECASES of
       btrfs-device(8).

   MAKING BLOCK GROUP LAYOUT MORE COMPACT
       The layout of block groups is not normally visible; most tools report
       only summarized numbers of free or used space, but there are still
       some hints provided.

       Let’s use the following real life example and start with the output:

           $ btrfs filesystem df /path
           Data, single: total=75.81GiB, used=64.44GiB
           System, RAID1: total=32.00MiB, used=20.00KiB
           Metadata, RAID1: total=15.87GiB, used=8.84GiB
           GlobalReserve, single: total=512.00MiB, used=0.00B

       Roughly calculating for data, 75G - 64G = 11G, the used/total ratio
       is about 85%. How can we interpret that:

       ·   chunks are filled by 85% on average, ie. the usage filter with
           anything smaller than 85 will likely not affect anything

       ·   in a more realistic scenario, the space is distributed unevenly,
           we can assume there are completely used chunks and the remaining
           are partially filled

       Compacting the layout could be used on both. In the former case it
       would spread data of a given chunk to the others and remove it. Here
       we can estimate that roughly 850 MiB of data have to be moved (85% of
       a 1 GiB chunk).

       In the latter case, targeting the partially used chunks will have to
       move less data and thus will be faster. A typical filter command
       would look like:

           # btrfs balance start -dusage=50 /path
           Done, had to relocate 2 out of 97 chunks

           $ btrfs filesystem df /path
           Data, single: total=74.03GiB, used=64.43GiB
           System, RAID1: total=32.00MiB, used=20.00KiB
           Metadata, RAID1: total=15.87GiB, used=8.84GiB
           GlobalReserve, single: total=512.00MiB, used=0.00B

       As you can see, the total amount of data is decreased by just 1 GiB,
       which is an expected result. Let’s see what will happen when we
       increase the estimated usage filter.

           # btrfs balance start -dusage=85 /path
           Done, had to relocate 13 out of 95 chunks

           $ btrfs filesystem df /path
           Data, single: total=68.03GiB, used=64.43GiB
           System, RAID1: total=32.00MiB, used=20.00KiB
           Metadata, RAID1: total=15.87GiB, used=8.85GiB
           GlobalReserve, single: total=512.00MiB, used=0.00B

       Now the used/total ratio is about 94% and we moved about
       74G - 68G = 6G of data to the remaining block groups, ie. the 6GiB
       are now free of filesystem structures and can be reused for new data
       or metadata block groups.

329
330 We can do a similar exercise with the metadata block groups, but this
331 should not typically be necessary, unless the used/total ratio is
332 really off. Here the ratio is roughly 50% but the difference as an
333 absolute number is "a few gigabytes", which can be considered normal
334 for a workload with snapshots or reflinks updated frequently.
335
336 # btrfs balance start -musage=50 /path
337 Done, had to relocate 4 out of 89 chunks
338
339 $ btrfs filesystem df /path
340 Data, single: total=68.03GiB, used=64.43GiB
341 System, RAID1: total=32.00MiB, used=20.00KiB
342 Metadata, RAID1: total=14.87GiB, used=8.85GiB
343 GlobalReserve, single: total=512.00MiB, used=0.00B
344
345 Just 1 GiB decrease, which possibly means there are block groups with
346 good utilization. Making the metadata layout more compact would in turn
347 require updating more metadata structures, ie. lots of IO. As running
348 out of metadata space is a more severe problem, it’s not necessary to
349 keep the utilization ratio too high. For the purpose of this example,
350 let’s see the effects of further compaction:
351
352 # btrfs balance start -musage=70 /path
353 Done, had to relocate 13 out of 88 chunks
354
355 $ btrfs filesystem df .
356 Data, single: total=68.03GiB, used=64.43GiB
357 System, RAID1: total=32.00MiB, used=20.00KiB
358 Metadata, RAID1: total=11.97GiB, used=8.83GiB
359 GlobalReserve, single: total=512.00MiB, used=0.00B
360
   GETTING RID OF COMPLETELY UNUSED BLOCK GROUPS
       Normally the balance operation needs a work space, to temporarily
       move the data before the old block groups get removed. If there’s no
       work space, it ends with no space left.

       There’s a special case when the block groups are completely unused,
       possibly left after removing lots of files or deleting snapshots.
       Removing empty block groups is automatic since 3.18. The same can be
       achieved manually, with the notable exception that this operation
       does not require the work space. Thus it can be used to reclaim
       unused block groups and make the space available again.

           # btrfs balance start -dusage=0 /path

       This should lead to a decrease of the total numbers in the btrfs
       filesystem df output.

EXIT STATUS
       Unless indicated otherwise below, all btrfs balance subcommands
       return a zero exit status if they succeed, and non zero in case of
       failure.

       The pause, cancel, and resume subcommands exit with a status of 2 if
       they fail because a balance operation was not running.

       The status subcommand exits with a status of 0 if a balance operation
       is not running, 1 if the command-line usage is incorrect or a balance
       operation is still running, and 2 on other errors.

AVAILABILITY
       btrfs is part of btrfs-progs. Please refer to the btrfs wiki
       http://btrfs.wiki.kernel.org for further details.

SEE ALSO
       mkfs.btrfs(8), btrfs-device(8)



Btrfs v5.10                       01/18/2021                  BTRFS-BALANCE(8)