btrfs-balance(8)

BTRFS-BALANCE(8)                     BTRFS                    BTRFS-BALANCE(8)

NAME

       btrfs-balance - balance block groups on a btrfs filesystem

SYNOPSIS

       btrfs balance <subcommand> <args>

DESCRIPTION

       The primary purpose of the balance feature is to spread block groups
       across all devices so they match constraints defined by the respective
       profiles. See mkfs.btrfs(8) section PROFILES for more details. The
       scope of the balancing process can be further tuned by use of filters
       that can select the block groups to process. Balance works only on a
       mounted filesystem. Extent sharing is preserved and reflinks are not
       broken. Files are neither defragmented nor recompressed; file extents
       are preserved but their physical location on devices will change.

       The balance operation is cancellable by the user. The on-disk state of
       the filesystem is always consistent, so an unexpected interruption
       (e.g. system crash, reboot) does not corrupt the filesystem. The
       progress of the balance operation is temporarily stored as an internal
       state and the operation will be resumed upon mount, unless the mount
       option skip_balance is specified.

       WARNING:
          Running balance without filters will take a lot of time as it
          basically moves data/metadata from the whole filesystem and needs
          to update all block pointers.

       The filters can be used to perform the following actions:

       • convert block group profiles (filter convert)

       • make block group usage more compact (filter usage)

       • perform actions only on a given device (filters devid, drange)

       The filters can be applied to a combination of block group types (data,
       metadata, system). Note that changing only the system type needs the
       force option. Otherwise the system type gets automatically converted
       whenever the metadata profile is converted.

       When metadata redundancy is reduced (e.g. from RAID1 to single) the
       force option is also required and this is noted in the system log.

       NOTE:
          The balance operation needs enough work space, i.e. space that is
          completely unused in the filesystem, otherwise this may lead to
          ENOSPC reports. See the section ENOSPC for more details.

COMPATIBILITY

       NOTE:
          The balance subcommand also exists under the btrfs filesystem
          namespace. This still works for backward compatibility but is
          deprecated and should not be used any more.

       NOTE:
          A short syntax btrfs balance <path> works due to backward
          compatibility but is deprecated and should not be used any more.
          Use the btrfs balance start command instead.

PERFORMANCE IMPLICATIONS

       Balancing operations are very IO intensive and can also be quite CPU
       intensive, impacting other ongoing filesystem operations. Typically
       large amounts of data are copied from one location to another, with
       corresponding metadata updates.

       Depending upon the block group layout, it can also be seek heavy.
       Performance on rotational devices is noticeably worse compared to SSDs
       or fast arrays.

SUBCOMMAND

       cancel <path>
              cancel a running or paused balance; the command blocks and
              waits until the block group currently being processed completes

              Since kernel 5.7 the response time of the cancellation is
              significantly improved; on older kernels it might take a long
              time until the currently processed chunk is completely finished.

       pause <path>
              pause a running balance operation; this stores the state of the
              balance progress and the filters used to the filesystem

       resume <path>
              resume an interrupted balance; the balance status must be stored
              on the filesystem from a previous run, e.g. after it was paused
              or forcibly interrupted and mounted again with skip_balance

       start [options] <path>
              start the balance operation according to the specified filters;
              without any filters the data and metadata from the whole
              filesystem are moved. The process runs in the foreground.

              NOTE:
                 The balance command without filters will basically move
                 everything in the filesystem to a new physical location on
                 devices (i.e. it does not affect the logical properties of
                 file extents like offsets within files and extent sharing).
                 The run time is potentially very long, depending on the
                 filesystem size. To prevent starting a full balance by
                 accident, the user is warned and has a few seconds to cancel
                 the operation before it starts. The warning and delay can be
                 skipped with the --full-balance option.

              Please note that the filters must be written together with the
              -d, -m and -s options (with no space in between, as in
              -dusage=50), because they're optional and bare -d and -m also
              work and mean no filters.

              NOTE:
                 When the target profile for the conversion filter is raid5
                 or raid6, there's a safety timeout of 10 seconds to warn
                 users about the status of the feature.

              Options

              -d[<filters>]
                     act on data block groups, see FILTERS section for details
                     about filters

              -m[<filters>]
                     act on metadata chunks, see FILTERS section for details
                     about filters

              -s[<filters>]
                     act on system chunks (requires -f), see FILTERS section
                     for details about filters

              -f     force a reduction of metadata integrity, e.g. when going
                     from raid1 to single, or skip the safety timeout when the
                     target conversion profile is raid5 or raid6

              --background|--bg
                     run the balance operation asynchronously in the
                     background, uses fork(2) to start the process that calls
                     the kernel ioctl

              --enqueue
                     wait if there's another exclusive operation running,
                     otherwise continue

              -v     (deprecated) alias for global -v option

       status [-v] <path>
              Show status of running or paused balance.

              Options

              -v     (deprecated) alias for global -v option

FILTERS

       From kernel 3.3 onwards, btrfs balance can limit its action to a subset
       of the whole filesystem, and can be used to change the replication
       configuration (e.g. moving data from single to RAID1). This
       functionality is accessed through the -d, -m or -s options to btrfs
       balance start, which filter on data, metadata and system blocks
       respectively.

       A filter has the following structure: type[=params][,type=...]

       The available types are:

       profiles=<profiles>
              Balances only block groups with the given profiles. Parameters
              are a list of profile names separated by "|" (pipe).

       usage=<percent>, usage=<range>
              Balances only block groups with usage under the given
              percentage. The value of 0 is allowed and will clean up
              completely unused block groups; this should not require any new
              work space allocated. You may want to use usage=0 in case
              balance is returning ENOSPC and your filesystem is not too full.

              The argument may be a single value or a range. The single value
              N means at most N percent used, equivalent to ..N range syntax.
              Kernels prior to 4.4 accept only the single value format. The
              minimum range boundary is inclusive, maximum is exclusive.

       devid=<id>
              Balances only block groups which have at least one chunk on the
              given device. To list devices with ids use btrfs filesystem
              show.

       drange=<range>
              Balance only block groups which overlap with the given byte
              range on any device. Use in conjunction with devid to filter on
              a specific device. The parameter is a range specified as
              start..end.

       vrange=<range>
              Balance only block groups which overlap with the given byte
              range in the filesystem's internal virtual address space. This
              is the address space that most reports from btrfs in the kernel
              log use. The parameter is a range specified as start..end.

       convert=<profile>
              Convert each selected block group to the profile named by the
              parameter.

              NOTE:
                 Starting with kernel 4.5, the data chunks can be converted
                 to/from the DUP profile on a single device.

              NOTE:
                 Starting with kernel 4.6, all profiles can be converted
                 to/from DUP on multi-device filesystems.

       limit=<number>, limit=<range>
              Process only the given number of chunks, after all filters are
              applied. This can be used to specifically target a chunk in
              connection with other filters (drange, vrange) or just simply
              limit the amount of work done by a single balance run.

              The argument may be a single value or a range. The single value
              N means at most N chunks, equivalent to ..N range syntax.
              Kernels prior to 4.4 accept only the single value format. The
              range minimum and maximum are inclusive.

       stripes=<range>
              Balance only block groups which have the given number of
              stripes. The parameter is a range specified as start..end. Makes
              sense for block group profiles that utilize striping, i.e.
              RAID0/10/5/6. The range minimum and maximum are inclusive.

       soft   Takes no parameters. Only has meaning when converting between
              profiles. When converting from one profile to another with soft
              mode on, chunks that already have the target profile are left
              untouched. This is useful e.g. when half of the filesystem was
              converted earlier but the conversion got cancelled.

              The soft mode switch is (like every other filter) per-type. For
              example, this means that we can convert metadata chunks the
              "hard" way while converting data chunks selectively with the
              soft switch.
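
       Since kernels prior to 4.4 accept only the single value format for the
       usage and limit filters, a script may want to choose the filter syntax
       based on the running kernel. The following is a minimal sketch and not
       part of btrfs-progs: the function names kernel_supports_ranges and
       usage_filter are made up for illustration, and the kernel version is
       assumed to come from uname -r unless one is passed explicitly.

```shell
#!/bin/sh
# Hypothetical helpers, not part of btrfs-progs.

# Succeeds if the kernel version ($1, default: uname -r) is 4.4 or newer,
# i.e. new enough to accept the range syntax of the usage and limit filters.
kernel_supports_ranges() {
    ver=${1:-$(uname -r)}
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%[!0-9]*}
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 4 ]; }
}

# Prints a usage filter selecting block groups between $1 and $2 percent
# used. Falls back to the single value form (at most $2 percent) when the
# kernel ($3, optional) is too old for ranges.
usage_filter() {
    if kernel_supports_ranges "$3"; then
        echo "usage=$1..$2"
    else
        echo "usage=$2"
    fi
}
```

       The result can be appended directly to the type option, for example
       btrfs balance start -d"$(usage_filter 10 50)" /path.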

       Profile names, used in profiles and convert, are one of: raid0, raid1,
       raid1c3, raid1c4, raid10, raid5, raid6, dup, single. The mixed
       data/metadata profiles can be converted in the same way, but
       conversion between mixed and non-mixed is not implemented. For the
       constraints of the profiles please refer to mkfs.btrfs(8), section
       PROFILES.
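
       As an illustration of the convert and soft filters, the sketch below
       checks profile names against the list above and prints a conversion
       command rather than running it. The function names is_profile and
       convert_cmd are hypothetical and not part of btrfs-progs. The sketch
       does not add -f, which is still required when metadata redundancy is
       reduced.

```shell
#!/bin/sh
# Hypothetical helpers, not part of btrfs-progs.

# Succeeds if $1 is one of the profile names accepted by profiles/convert.
is_profile() {
    case $1 in
        raid0|raid1|raid1c3|raid1c4|raid10|raid5|raid6|dup|single) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints a balance command that converts data to profile $1 and metadata to
# profile $2 on filesystem $3. The soft filter leaves chunks that already
# have the target profile untouched.
convert_cmd() {
    is_profile "$1" && is_profile "$2" || {
        echo "unknown profile name" >&2
        return 1
    }
    echo "btrfs balance start -dconvert=$1,soft -mconvert=$2,soft $3"
}
```

       For example, convert_cmd raid1 raid1 /path prints a command that
       converts both data and metadata to raid1 and can be re-run after a
       cancelled conversion without redoing the chunks already converted.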

ENOSPC

       The way balance operates, it usually needs to temporarily create a new
       block group and move the old data there, before the old block group can
       be removed. For that it needs the work space, otherwise it fails for
       ENOSPC reasons. This is not the same ENOSPC as when the free space is
       exhausted. It refers to the space on the level of block groups, which
       are bigger parts of the filesystem that contain many file extents.

       The free work space can be calculated from the output of the btrfs
       filesystem show command:

          Label: 'BTRFS'  uuid: 8a9d72cd-ead3-469d-b371-9c7203276265
                  Total devices 2 FS bytes used 77.03GiB
                  devid    1 size 53.90GiB used 51.90GiB path /dev/sdc2
                  devid    2 size 53.90GiB used 51.90GiB path /dev/sde1

       size - used = free work space

       53.90GiB - 51.90GiB = 2.00GiB
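
       The same subtraction can be scripted. The following is a minimal
       sketch, assuming the devid lines have the layout shown above and that
       size and used carry the same unit suffix; the function name
       free_work_space is made up for illustration and is not part of
       btrfs-progs.

```shell
#!/bin/sh
# Hypothetical helper, not part of btrfs-progs.

# Reads `btrfs filesystem show` output on stdin and prints, for each device,
# its path and the work space (size - used).
# Assumption: size and used on a devid line share the same unit suffix.
free_work_space() {
    awk '$1 == "devid" && $3 == "size" && $5 == "used" {
        size = $4
        used = $6
        # Strip the leading number to get the unit suffix, e.g. GiB.
        unit = size
        sub(/^[0-9.]+/, "", unit)
        used_unit = used
        sub(/^[0-9.]+/, "", used_unit)
        if (unit != used_unit) {
            print $8 ": units differ, compute by hand"
            next
        }
        # Adding 0 converts the leading numeric part of the string.
        printf "%s %.2f%s\n", $8, (size + 0) - (used + 0), unit
    }'
}
```

       A typical invocation would be btrfs filesystem show /path |
       free_work_space, which prints one line per device.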

       An example of a filter that does not require work space is usage=0.
       This will scan through all unused block groups of a given type and
       will reclaim the space. After that it might be possible to run other
       filters.

       CONVERSIONS ON MULTIPLE DEVICES

       Conversion to profiles based on striping (RAID0, RAID5/6) requires
       work space on each device. An interrupted balance may leave partially
       filled block groups that consume the work space.

EXAMPLES

       A more comprehensive example when going from one to multiple devices,
       and back, can be found in section TYPICAL USECASES of btrfs-device(8).

   MAKING BLOCK GROUP LAYOUT MORE COMPACT
       The layout of block groups is not normally visible; most tools report
       only summarized numbers of free or used space, but there are still some
       hints provided.

       Let's use the following real life example and start with the output:

          $ btrfs filesystem df /path
          Data, single: total=75.81GiB, used=64.44GiB
          System, RAID1: total=32.00MiB, used=20.00KiB
          Metadata, RAID1: total=15.87GiB, used=8.84GiB
          GlobalReserve, single: total=512.00MiB, used=0.00B

       Roughly calculating for data, 75G - 64G = 11G, the used/total ratio is
       about 85%. How can we interpret that:

       • chunks are filled by 85% on average, i.e. the usage filter with
         anything smaller than 85 will likely not affect anything

       • in a more realistic scenario, the space is distributed unevenly, we
         can assume there are completely used chunks and the remaining are
         partially filled
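
       The used/total ratio can also be computed from the btrfs filesystem df
       output with a short script. This is a minimal sketch, assuming the
       line format shown above and binary unit suffixes (KiB, MiB, GiB, TiB);
       the function name df_ratio is hypothetical and not part of
       btrfs-progs.

```shell
#!/bin/sh
# Hypothetical helper, not part of btrfs-progs.

# Reads `btrfs filesystem df` output on stdin and prints, for each block
# group type, the used/total ratio rounded to a whole percentage.
df_ratio() {
    awk '
    # Convert a value such as 75.81GiB to bytes (binary unit suffixes).
    function to_bytes(v,    n, u) {
        n = v + 0
        u = v
        sub(/^[0-9.]+/, "", u)
        if (u == "KiB") return n * 1024
        if (u == "MiB") return n * 1024 * 1024
        if (u == "GiB") return n * 1024 * 1024 * 1024
        if (u == "TiB") return n * 1024 * 1024 * 1024 * 1024
        return n
    }
    {
        type = $1
        sub(/,$/, "", type)
        total = ""
        used = ""
        for (i = 2; i <= NF; i++) {
            field = $i
            sub(/,$/, "", field)
            if (field ~ /^total=/) total = substr(field, 7)
            else if (field ~ /^used=/) used = substr(field, 6)
        }
        if (total != "" && to_bytes(total) > 0)
            printf "%s %.0f%%\n", type, 100 * to_bytes(used) / to_bytes(total)
    }'
}
```

       Fed with the output above (btrfs filesystem df /path | df_ratio), the
       Data line comes out at 85%, matching the rough calculation.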

       Compacting the layout could be used on both. In the former case it
       would spread data of a given chunk to the others and remove it. Here
       we can estimate that roughly 850 MiB of data have to be moved (85% of a
       1 GiB chunk).

       In the latter case, targeting the partially used chunks will have to
       move less data and thus will be faster. A typical filter command would
       look like:

          # btrfs balance start -dusage=50 /path
          Done, had to relocate 2 out of 97 chunks

          $ btrfs filesystem df /path
          Data, single: total=74.03GiB, used=64.43GiB
          System, RAID1: total=32.00MiB, used=20.00KiB
          Metadata, RAID1: total=15.87GiB, used=8.84GiB
          GlobalReserve, single: total=512.00MiB, used=0.00B

       As you can see, the total amount of data is decreased by just under
       2 GiB (75.81GiB to 74.03GiB), which is an expected result. Let's see
       what will happen when we increase the estimated usage filter.

          # btrfs balance start -dusage=85 /path
          Done, had to relocate 13 out of 95 chunks

          $ btrfs filesystem df /path
          Data, single: total=68.03GiB, used=64.43GiB
          System, RAID1: total=32.00MiB, used=20.00KiB
          Metadata, RAID1: total=15.87GiB, used=8.85GiB
          GlobalReserve, single: total=512.00MiB, used=0.00B

       Now the used/total ratio is about 94% and we moved about 74G - 68G = 6G
       of data to the remaining block groups, i.e. the 6GiB are now free of
       filesystem structures, and can be reused for new data or metadata block
       groups.

       We can do a similar exercise with the metadata block groups, but this
       should not typically be necessary, unless the used/total ratio is
       really off. Here the ratio is roughly 50% but the difference as an
       absolute number is "a few gigabytes", which can be considered normal
       for a workload with snapshots or reflinks updated frequently.

          # btrfs balance start -musage=50 /path
          Done, had to relocate 4 out of 89 chunks

          $ btrfs filesystem df /path
          Data, single: total=68.03GiB, used=64.43GiB
          System, RAID1: total=32.00MiB, used=20.00KiB
          Metadata, RAID1: total=14.87GiB, used=8.85GiB
          GlobalReserve, single: total=512.00MiB, used=0.00B

       Just 1 GiB decrease, which possibly means there are block groups with
       good utilization. Making the metadata layout more compact would in turn
       require updating more metadata structures, i.e. lots of IO. As running
       out of metadata space is a more severe problem, it's not necessary to
       keep the utilization ratio too high. For the purpose of this example,
       let's see the effects of further compaction:

          # btrfs balance start -musage=70 /path
          Done, had to relocate 13 out of 88 chunks

          $ btrfs filesystem df .
          Data, single: total=68.03GiB, used=64.43GiB
          System, RAID1: total=32.00MiB, used=20.00KiB
          Metadata, RAID1: total=11.97GiB, used=8.83GiB
          GlobalReserve, single: total=512.00MiB, used=0.00B

   GETTING RID OF COMPLETELY UNUSED BLOCK GROUPS
       Normally the balance operation needs a work space, to temporarily move
       the data before the old block groups get removed. If there's no work
       space, it ends with no space left.

       There's a special case when the block groups are completely unused,
       possibly left after removing lots of files or deleting snapshots.
       Removing empty block groups is automatic since kernel 3.18. The same
       can be achieved manually with a notable exception that this operation
       does not require the work space. Thus it can be used to reclaim unused
       block groups and make the space available.

          # btrfs balance start -dusage=0 /path

       This should lead to a decrease in the total numbers in the btrfs
       filesystem df output.
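
       Since usage=0 needs no work space and higher thresholds move
       progressively more data, one possible approach is to run balance with
       increasing usage values. The sketch below is only an illustration: the
       function name balance_stepwise, the BTRFS variable and the default
       thresholds 0, 10, 25 and 50 are assumptions, not something provided or
       recommended by btrfs-progs.

```shell
#!/bin/sh
# Hypothetical helper, not part of btrfs-progs.

# Runs a data balance on filesystem $1 once per usage threshold given in the
# remaining arguments (assumed default: 0 10 25 50), stopping at the first
# failure. Set BTRFS="echo btrfs" to print the commands instead of running
# them.
balance_stepwise() {
    path=$1
    shift
    [ $# -gt 0 ] || set -- 0 10 25 50
    for pct in "$@"; do
        ${BTRFS:-btrfs} balance start -dusage="$pct" "$path" || return 1
    done
}
```

       Running it first with BTRFS="echo btrfs" shows the commands that would
       be executed, which is a cheap way to review the thresholds before
       starting any IO.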

EXIT STATUS

       Unless indicated otherwise below, all btrfs balance subcommands return
       a zero exit status if they succeed, and non-zero in case of failure.

       The pause, cancel, and resume subcommands exit with a status of 2 if
       they fail because a balance operation was not running.

       The status subcommand exits with a status of 0 if a balance operation
       is not running, 1 if the command-line usage is incorrect or a balance
       operation is still running, and 2 on other errors.
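
       Scripts that poll the status subcommand need to treat 1 as a possibly
       normal result rather than a failure. A minimal sketch of such a
       mapping follows; the function name describe_status_rc and the message
       texts are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of btrfs-progs.

# Translates the exit status ($1) of `btrfs balance status` into a short
# message, following the values documented in this section.
describe_status_rc() {
    case $1 in
        0) echo "no balance running" ;;
        1) echo "balance running or incorrect usage" ;;
        2) echo "other error" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}
```

       It would typically be called right after the command, as in
       btrfs balance status /path; describe_status_rc $?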

AVAILABILITY

       btrfs is part of btrfs-progs. Please refer to the btrfs wiki
       http://btrfs.wiki.kernel.org for further details.

COPYRIGHT

       2022

5.18                             May 25, 2022                 BTRFS-BALANCE(8)