BTRFS-BALANCE(8)                 Btrfs Manual                 BTRFS-BALANCE(8)


NAME
       btrfs-balance - balance block groups on a btrfs filesystem

SYNOPSIS
       btrfs balance <subcommand> <args>

DESCRIPTION
       The primary purpose of the balance feature is to spread block groups
       across all devices so they match constraints defined by the respective
       profiles. See mkfs.btrfs(8) section PROFILES for more details. The
       scope of the balancing process can be further tuned by use of filters
       that can select the block groups to process. Balance works only on a
       mounted filesystem. Extent sharing is preserved and reflinks are not
       broken. Files are not defragmented nor recompressed; file extents are
       preserved but their physical location on devices will change.

       The balance operation is cancellable by the user. The on-disk state of
       the filesystem is always consistent so an unexpected interruption (eg.
       system crash, reboot) does not corrupt the filesystem. The progress of
       the balance operation is temporarily stored as an internal state and
       will be resumed upon mount, unless the mount option skip_balance is
       specified.

       Warning
           running balance without filters will take a lot of time as it
           basically moves data/metadata from the whole filesystem and needs
           to update all block pointers.

       The filters can be used to perform the following actions:

       ·   convert block group profiles (filter convert)

       ·   make block group usage more compact (filter usage)

       ·   perform actions only on a given device (filters devid, drange)

       The filters can be applied to a combination of block group types
       (data, metadata, system). Note that changing only the system type
       needs the force option. Otherwise the system type is converted
       automatically whenever the metadata profile is converted.

       When metadata redundancy is reduced (eg. from RAID1 to single) the
       force option is also required and this is noted in the system log.

       Note
           the balance operation needs enough work space, ie. space that is
           completely unused in the filesystem, otherwise this may lead to
           ENOSPC reports. See the section ENOSPC for more details.

COMPATIBILITY
       Note
           The balance subcommand also exists under the btrfs filesystem
           namespace. This still works for backward compatibility but is
           deprecated and should not be used any more.

       Note
           A short syntax btrfs balance <path> works due to backward
           compatibility but is deprecated and should not be used any more.
           Use the btrfs balance start command instead.

PERFORMANCE IMPLICATIONS
       Balancing operations are very IO intensive and can also be quite CPU
       intensive, impacting other ongoing filesystem operations. Typically
       large amounts of data are copied from one location to another, with
       corresponding metadata updates.

       Depending upon the block group layout, it can also be seek heavy.
       Performance on rotational devices is noticeably worse compared to SSDs
       or fast arrays.

SUBCOMMAND
       cancel <path>
           cancels a running or paused balance, the command will block and
           wait until the block group currently being processed completes

           Since kernel 5.7 the response time of the cancellation is
           significantly improved, on older kernels it might take a long time
           until the currently processed chunk is completely finished.

       pause <path>
           pause a running balance operation, this will store the state of
           the balance progress and the used filters to the filesystem

       resume <path>
           resume an interrupted balance, the balance status must be stored
           on the filesystem from a previous run, eg. after it was paused or
           forcibly interrupted and mounted again with skip_balance
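
           For example, a balance paused earlier can be resumed from its
           stored state (the sequence below is only illustrative):

               # btrfs balance pause /path
               # btrfs balance resume /path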
92
93 start [options] <path>
94 start the balance operation according to the specified filters,
95 without any filters the data and metadata from the whole filesystem
96 are moved. The process runs in the foreground.
97
98 Note
99 the balance command without filters will basically move
100 everything in the filesystem to a new physical location on
101 devices (ie. it does not affect the logical properties of file
102 extents like offsets within files and extent sharing). The run
103 time is potentially very long, depending on the filesystem
104 size. To prevent starting a full balance by accident, the user
105 is warned and has a few seconds to cancel the operation before
106 it starts. The warning and delay can be skipped with
107 --full-balance option.
108 Please note that the filters must be written together with the -d,
109 -m and -s options, because they’re optional and bare -d and -m also
110 work and mean no filters.
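
           For example (the percentage is illustrative), -dusage=85 applies
           the usage filter to data block groups, while a bare -d selects
           all data block groups with no filter:

               # btrfs balance start -dusage=85 /path
               # btrfs balance start -d /path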

           Options

           -d[<filters>]
               act on data block groups, see the FILTERS section for details
               about filters

           -m[<filters>]
               act on metadata chunks, see the FILTERS section for details
               about filters

           -s[<filters>]
               act on system chunks (requires -f), see the FILTERS section
               for details about filters.

           -f
               force a reduction of metadata integrity, eg. when going from
               raid1 to single

           --background|--bg
               run the balance operation asynchronously in the background,
               uses fork(2) to start the process that calls the kernel ioctl

           --enqueue
               wait if there’s another exclusive operation running, otherwise
               continue

           -v
               (deprecated) alias for the global -v option

       status [-v] <path>
           Show status of a running or paused balance.

           Options

           -v
               (deprecated) alias for the global -v option

FILTERS
       From kernel 3.3 onwards, btrfs balance can limit its action to a
       subset of the whole filesystem, and can be used to change the
       replication configuration (e.g. moving data from single to RAID1).
       This functionality is accessed through the -d, -m or -s options to
       btrfs balance start, which filter on data, metadata and system blocks
       respectively.

       A filter has the following structure: type[=params][,type=...]

       The available types are:

       profiles=<profiles>
           Balances only block groups with the given profiles. Parameters
           are a list of profile names separated by "|" (pipe).
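
           For example, the following (illustrative) command balances only
           data block groups that currently use the raid0 or single profile;
           the pipe character usually needs quoting in the shell:

               # btrfs balance start -dprofiles='raid0|single' /path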

       usage=<percent>, usage=<range>
           Balances only block groups with usage under the given percentage.
           The value of 0 is allowed and will clean up completely unused
           block groups; this should not require any new work space to be
           allocated. You may want to use usage=0 in case balance is
           returning ENOSPC and your filesystem is not too full.

           The argument may be a single value or a range. The single value N
           means at most N percent used, equivalent to the ..N range syntax.
           Kernels prior to 4.4 accept only the single value format. The
           minimum range boundary is inclusive, maximum is exclusive.
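
           As an illustration (percentages chosen arbitrarily), the first
           command below processes data block groups used at most 50%, the
           second only those in the 10..50 range (the range form needs
           kernel 4.4 or newer):

               # btrfs balance start -dusage=50 /path
               # btrfs balance start -dusage=10..50 /path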

       devid=<id>
           Balances only block groups which have at least one chunk on the
           given device. To list devices with ids use btrfs filesystem show.
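
           For example, assuming btrfs filesystem show lists a device with
           devid 2, the following would balance only block groups that have
           a chunk on that device:

               # btrfs balance start -ddevid=2 /path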

       drange=<range>
           Balance only block groups which overlap with the given byte range
           on any device. Use in conjunction with devid to filter on a
           specific device. The parameter is a range specified as start..end.
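
           A sketch of combining both filters (the device id and byte range
           are made up for illustration); this restricts the balance to data
           chunks on device 1 that overlap its first 10GiB:

               # btrfs balance start -ddevid=1,drange=0..10737418240 /path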

       vrange=<range>
           Balance only block groups which overlap with the given byte range
           in the filesystem’s internal virtual address space. This is the
           address space that most reports from btrfs in the kernel log use.
           The parameter is a range specified as start..end.

       convert=<profile>
           Convert each selected block group to the profile named by the
           parameter.

           Note
               starting with kernel 4.5, the data chunks can be converted
               to/from the DUP profile on a single device.

           Note
               starting with kernel 4.6, all profiles can be converted
               to/from DUP on multi-device filesystems.
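
           A typical conversion, e.g. after adding a second device, converts
           both data and metadata (shown here only as an illustration):

               # btrfs balance start -dconvert=raid1 -mconvert=raid1 /path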

       limit=<number>, limit=<range>
           Process only the given number of chunks, after all other filters
           are applied. This can be used to specifically target a chunk in
           connection with other filters (drange, vrange) or just simply
           limit the amount of work done by a single balance run.

           The argument may be a single value or a range. The single value N
           means at most N chunks, equivalent to the ..N range syntax.
           Kernels prior to 4.4 accept only the single value format. The
           range minimum and maximum are inclusive.
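
           For example (values are arbitrary), each of the following runs
           relocates at most 10 of the block groups selected by the usage
           filter, spreading the work over several invocations:

               # btrfs balance start -dusage=50,limit=10 /path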

       stripes=<range>
           Balance only block groups which have the given number of stripes.
           The parameter is a range specified as start..end. Makes sense for
           block group profiles that utilize striping, ie. RAID0/10/5/6. The
           range minimum and maximum are inclusive.
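
           As an illustration, after adding devices to a striped filesystem
           the following would rewrite only block groups that are still
           striped over at most 2 devices (the range is an example):

               # btrfs balance start -dstripes=1..2 /path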

       soft
           Takes no parameters. Only has meaning when converting between
           profiles. When doing a convert from one profile to another and
           soft mode is on, chunks that already have the target profile are
           left untouched. This is useful e.g. when half of the filesystem
           was converted earlier but the conversion got cancelled.

           The soft mode switch is (like every other filter) per-type. For
           example, this means that we can convert metadata chunks the
           "hard" way while converting data chunks selectively with the soft
           switch.
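
           An illustrative command that finishes a previously interrupted
           conversion to RAID1, skipping data chunks that already have the
           target profile while converting metadata unconditionally:

               # btrfs balance start -dconvert=raid1,soft -mconvert=raid1 /path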

       Profile names, used in profiles and convert, are one of: raid0,
       raid1, raid10, raid5, raid6, dup, single. The mixed data/metadata
       profiles can be converted in the same way, but conversion between
       mixed and non-mixed profiles is not implemented. For the constraints
       of the profiles please refer to mkfs.btrfs(8), section PROFILES.

ENOSPC
       The way balance operates, it usually needs to temporarily create a
       new block group and move the old data there, before the old block
       group can be removed. For that it needs the work space, otherwise it
       fails for ENOSPC reasons. This is not the same ENOSPC as if the free
       space is exhausted. This refers to the space on the level of block
       groups, which are bigger parts of the filesystem that contain many
       file extents.

       The free work space can be calculated from the output of the btrfs
       filesystem show command:

           Label: 'BTRFS'  uuid: 8a9d72cd-ead3-469d-b371-9c7203276265
                   Total devices 2 FS bytes used 77.03GiB
                   devid    1 size 53.90GiB used 51.90GiB path /dev/sdc2
                   devid    2 size 53.90GiB used 51.90GiB path /dev/sde1

           size - used = free work space
           53.90GiB - 51.90GiB = 2.00GiB

       An example of a filter that does not require workspace is usage=0.
       This will scan through all unused block groups of a given type and
       will reclaim the space. After that it might be possible to run other
       filters.

   CONVERSIONS ON MULTIPLE DEVICES
       Conversion to profiles based on striping (RAID0, RAID5/6) requires
       work space on each device. An interrupted balance may leave partially
       filled block groups that consume the work space.

EXAMPLES
       A more comprehensive example when going from one to multiple devices,
       and back, can be found in section TYPICAL USECASES of
       btrfs-device(8).

   MAKING BLOCK GROUP LAYOUT MORE COMPACT
       The layout of block groups is not normally visible; most tools report
       only summarized numbers of free or used space, but there are still
       some hints provided.

       Let’s use the following real life example and start with the output:

           $ btrfs filesystem df /path
           Data, single: total=75.81GiB, used=64.44GiB
           System, RAID1: total=32.00MiB, used=20.00KiB
           Metadata, RAID1: total=15.87GiB, used=8.84GiB
           GlobalReserve, single: total=512.00MiB, used=0.00B

       Roughly calculating for data, 75G - 64G = 11G, the used/total ratio
       is about 85%. How can we interpret that:

       ·   chunks are filled by 85% on average, ie. the usage filter with
           anything smaller than 85 will likely not affect anything

       ·   in a more realistic scenario, the space is distributed unevenly,
           we can assume there are completely used chunks and the remaining
           are partially filled

       Compacting the layout could be used on both. In the former case it
       would spread data of a given chunk to the others and remove it. Here
       we can estimate that roughly 850 MiB of data have to be moved (85% of
       a 1 GiB chunk).

       In the latter case, targeting the partially used chunks will have to
       move less data and thus will be faster. A typical filter command
       would look like:

           # btrfs balance start -dusage=50 /path
           Done, had to relocate 2 out of 97 chunks

           $ btrfs filesystem df /path
           Data, single: total=74.03GiB, used=64.43GiB
           System, RAID1: total=32.00MiB, used=20.00KiB
           Metadata, RAID1: total=15.87GiB, used=8.84GiB
           GlobalReserve, single: total=512.00MiB, used=0.00B

       As you can see, the total amount of data is decreased by just 1 GiB,
       which is an expected result. Let’s see what will happen when we
       increase the estimated usage filter.

           # btrfs balance start -dusage=85 /path
           Done, had to relocate 13 out of 95 chunks

           $ btrfs filesystem df /path
           Data, single: total=68.03GiB, used=64.43GiB
           System, RAID1: total=32.00MiB, used=20.00KiB
           Metadata, RAID1: total=15.87GiB, used=8.85GiB
           GlobalReserve, single: total=512.00MiB, used=0.00B

       Now the used/total ratio is about 94% and we moved about
       74G - 68G = 6G of data to the remaining block groups, ie. the 6GiB
       are now free of filesystem structures and can be reused for new data
       or metadata block groups.

329
330 We can do a similar exercise with the metadata block groups, but this
331 should not typically be necessary, unless the used/total ratio is
332 really off. Here the ratio is roughly 50% but the difference as an
333 absolute number is "a few gigabytes", which can be considered normal
334 for a workload with snapshots or reflinks updated frequently.
335
336 # btrfs balance start -musage=50 /path
337 Done, had to relocate 4 out of 89 chunks
338
339 $ btrfs filesystem df /path
340 Data, single: total=68.03GiB, used=64.43GiB
341 System, RAID1: total=32.00MiB, used=20.00KiB
342 Metadata, RAID1: total=14.87GiB, used=8.85GiB
343 GlobalReserve, single: total=512.00MiB, used=0.00B
344
345 Just 1 GiB decrease, which possibly means there are block groups with
346 good utilization. Making the metadata layout more compact would in turn
347 require updating more metadata structures, ie. lots of IO. As running
348 out of metadata space is a more severe problem, it’s not necessary to
349 keep the utilization ratio too high. For the purpose of this example,
350 let’s see the effects of further compaction:
351
352 # btrfs balance start -musage=70 /path
353 Done, had to relocate 13 out of 88 chunks
354
355 $ btrfs filesystem df .
356 Data, single: total=68.03GiB, used=64.43GiB
357 System, RAID1: total=32.00MiB, used=20.00KiB
358 Metadata, RAID1: total=11.97GiB, used=8.83GiB
359 GlobalReserve, single: total=512.00MiB, used=0.00B
360
   GETTING RID OF COMPLETELY UNUSED BLOCK GROUPS
       Normally the balance operation needs a work space, to temporarily
       move the data before the old block groups get removed. If there’s no
       work space, it ends with no space left.

       There’s a special case when the block groups are completely unused,
       possibly left after removing lots of files or deleting snapshots.
       Removing empty block groups is automatic since 3.18. The same can be
       achieved manually, with the notable exception that this operation
       does not require the work space. Thus it can be used to reclaim
       unused block groups and make the space available again.

           # btrfs balance start -dusage=0 /path

       This should lead to a decrease of the total numbers in the btrfs
       filesystem df output.

EXIT STATUS
       Unless indicated otherwise below, all btrfs balance subcommands
       return a zero exit status if they succeed, and non zero in case of
       failure.

       The pause, cancel, and resume subcommands exit with a status of 2 if
       they fail because a balance operation was not running.

       The status subcommand exits with a status of 0 if a balance operation
       is not running, 1 if the command-line usage is incorrect or a balance
       operation is still running, and 2 on other errors.

AVAILABILITY
       btrfs is part of btrfs-progs. Please refer to the btrfs wiki
       http://btrfs.wiki.kernel.org for further details.

SEE ALSO
       mkfs.btrfs(8), btrfs-device(8)



Btrfs v5.10                       01/18/2021                  BTRFS-BALANCE(8)