CEPH-BLUESTORE-TOOL(8)               Ceph               CEPH-BLUESTORE-TOOL(8)

NAME

       ceph-bluestore-tool - bluestore administrative tool

SYNOPSIS

       ceph-bluestore-tool command
       [ --dev device ... ]
       [ -i osd_id ]
       [ --path osd path ]
       [ --out-dir dir ]
       [ --log-file | -l filename ]
       [ --deep ]
       ceph-bluestore-tool fsck|repair --path osd path [ --deep ]
       ceph-bluestore-tool qfsck       --path osd path
       ceph-bluestore-tool allocmap    --path osd path
       ceph-bluestore-tool restore_cfb --path osd path
       ceph-bluestore-tool show-label --dev device ...
       ceph-bluestore-tool prime-osd-dir --dev device --path osd path
       ceph-bluestore-tool bluefs-export --path osd path --out-dir dir
       ceph-bluestore-tool bluefs-bdev-new-wal --path osd path --dev-target new-device
       ceph-bluestore-tool bluefs-bdev-new-db --path osd path --dev-target new-device
       ceph-bluestore-tool bluefs-bdev-migrate --path osd path --dev-target new-device --devs-source device1 [--devs-source device2]
       ceph-bluestore-tool free-dump|free-score --path osd path [ --allocator block/bluefs-wal/bluefs-db/bluefs-slow ]
       ceph-bluestore-tool reshard --path osd path --sharding new sharding [ --resharding-ctrl control string ]
       ceph-bluestore-tool show-sharding --path osd path

DESCRIPTION

       ceph-bluestore-tool is a utility to perform low-level administrative
       operations on a BlueStore instance.

COMMANDS

       help
          Show help.

       fsck [ --deep ]
          Run a consistency check on BlueStore metadata. If --deep is
          specified, also read all object data and verify checksums.

       repair
          Run a consistency check and repair any errors that can be fixed.

       qfsck
          Run a consistency check on BlueStore metadata, comparing allocator
          data (from the RocksDB CFB when it exists, otherwise from the
          allocation file) with the ONodes state.

       allocmap
          Performs the same check done by qfsck and then stores a new
          allocation file. This command is disabled by default and requires a
          special build.

       restore_cfb
          Reverses changes done by the new NCB code (either through a Ceph
          restart or by running the allocmap command) and restores the RocksDB
          B Column-Family (allocator-map).

       bluefs-export
          Export the contents of BlueFS (i.e., RocksDB files) to an output
          directory.

       bluefs-bdev-sizes --path osd path
          Print the device sizes, as understood by BlueFS, to stdout.

       bluefs-bdev-expand --path osd path
          Instruct BlueFS to check the size of its block devices and, if they
          have expanded, make use of the additional space. Note that only new
          files created by BlueFS will be allocated on the preferred block
          device if it has enough free space. Existing files that have spilled
          over to the slow device will be gradually removed when RocksDB
          performs compaction. In other words, any data spilled over to the
          slow device will be moved to the fast device over time.

       bluefs-bdev-new-wal --path osd path --dev-target new-device
          Adds a WAL device to BlueFS. Fails if a WAL device already exists.

       bluefs-bdev-new-db --path osd path --dev-target new-device
          Adds a DB device to BlueFS. Fails if a DB device already exists.

       bluefs-bdev-migrate --dev-target new-device --devs-source device1
       [--devs-source device2]
          Moves BlueFS data from the source device(s) to the target device.
          Source devices (except the main one) are removed on success. The
          target device can be either an already attached device or a new one.
          In the latter case it is added to the OSD, replacing one of the
          source devices. The following replacement rules apply (in order of
          precedence, stopping at the first match):

              • if the source list has a DB volume, the target device replaces
                it.

              • if the source list has a WAL volume, the target device replaces
                it.

              • if the source list has the slow volume only, the operation is
                not permitted and requires explicit allocation via the
                new-db/new-wal command.
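
          The precedence of the replacement rules above can be sketched as a
          small shell function. This is only an illustration of the documented
          rules, not code taken from ceph-bluestore-tool: the function name
          migrate_replaces and the role labels db, wal and slow are
          hypothetical names chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of ceph-bluestore-tool): given the roles of
# the volumes listed via --devs-source, report which volume a NEW target
# device would replace, following the documented order of precedence.
migrate_replaces() {
    if [ "$#" -eq 0 ]; then
        echo "at least one source role is required" >&2
        return 2
    fi
    has_db=0
    has_wal=0
    for role in "$@"; do
        case "$role" in
            db)   has_db=1 ;;
            wal)  has_wal=1 ;;
            slow) ;;
            *)    echo "unknown role: $role" >&2; return 2 ;;
        esac
    done
    if [ "$has_db" -eq 1 ]; then
        echo "db"             # rule 1: a DB volume in the list is replaced
    elif [ "$has_wal" -eq 1 ]; then
        echo "wal"            # rule 2: otherwise a WAL volume is replaced
    else
        echo "not permitted"  # rule 3: slow volume only
        return 1
    fi
}

migrate_replaces db wal         # prints: db
migrate_replaces wal slow       # prints: wal
migrate_replaces slow || true   # prints: not permitted
```

          When only the slow volume is listed, the sketch returns a non-zero
          status, mirroring the rule that a new DB or WAL device must first be
          added explicitly.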

       show-label --dev device [...]
          Show device label(s).

       free-dump --path osd path [ --allocator block/bluefs-wal/bluefs-db/bluefs-slow ]
          Dump all free regions in the allocator.

       free-score --path osd path [ --allocator block/bluefs-wal/bluefs-db/bluefs-slow ]
          Give a number in the range [0-1] that represents the quality of
          fragmentation in the allocator. 0 represents the case when all free
          space is in one chunk. 1 represents the worst possible
          fragmentation.

       reshard --path osd path --sharding new sharding [ --resharding-ctrl
       control string ]
          Changes sharding of BlueStore's RocksDB. Sharding is built on top of
          RocksDB column families. This command makes it possible to test the
          performance of a new sharding scheme without redeploying the OSD.
          Resharding is usually a long process that involves walking through
          the entire RocksDB key space and moving some keys to different
          column families. The --resharding-ctrl option provides performance
          control over the resharding process. An interrupted resharding will
          prevent the OSD from running, but it does not corrupt data. It is
          always possible to continue the previous resharding or to select any
          other sharding scheme, including reverting to the original one.

       show-sharding --path osd path
          Show sharding that is currently applied to BlueStore's RocksDB.

OPTIONS

       --dev *device*
              Add device to the list of devices to consider.

       -i *osd_id*
              Operate as OSD osd_id. Connect to the monitor for OSD-specific
              options. If the monitor is unavailable, add --no-mon-config to
              read from ceph.conf instead.

       --devs-source *device*
              Add device to the list of devices to consider as sources for the
              migrate operation.

       --dev-target *device*
              Specify the target device for the migrate operation, or the
              device to add when adding a new DB/WAL.

       --path *osd path*
              Specify an osd path. In most cases, the device list is inferred
              from the symlinks present in osd path. This is usually simpler
              than explicitly specifying the device(s) with --dev. Not
              necessary if -i osd_id is provided.

       --out-dir *dir*
              Output directory for bluefs-export.

       -l, --log-file *log file*
              File to log to.

       --log-level *num*
              Debug log level. Default is 30 (extremely verbose), 20 is very
              verbose, 10 is verbose, and 1 is not very verbose.

       --deep
              Deep scrub/repair (read and validate object data, not just
              metadata).

       --allocator *name*
              Useful for free-dump and free-score actions. Selects
              allocator(s).

       --resharding-ctrl *control string*
              Provides control over the resharding process. Specifies how
              often to refresh the RocksDB iterator, and how large the commit
              batch should be before committing to RocksDB. Option format is:
              <iterator_refresh_bytes>/<iterator_refresh_keys>/<batch_commit_bytes>/<batch_commit_keys>
              Default: 10000000/10000/1000000/1000
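
              As an illustration of the option format, the following sketch
              assembles a control string from its four fields. The helper name
              make_resharding_ctrl is hypothetical, and the check that every
              field is a non-negative integer is an assumption made for this
              example rather than documented behaviour of the tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of ceph-bluestore-tool): build a value for
# --resharding-ctrl in the documented order:
#   <iterator_refresh_bytes>/<iterator_refresh_keys>/<batch_commit_bytes>/<batch_commit_keys>
make_resharding_ctrl() {
    if [ "$#" -ne 4 ]; then
        echo "expected 4 fields, got $#" >&2
        return 2
    fi
    for field in "$@"; do
        case "$field" in
            # Assumption: each field is a plain non-negative integer.
            ''|*[!0-9]*) echo "not a non-negative integer: $field" >&2
                         return 2 ;;
        esac
    done
    echo "$1/$2/$3/$4"
}

# Reproduce the documented default value.
ctrl=$(make_resharding_ctrl 10000000 10000 1000000 1000)
echo "--resharding-ctrl $ctrl"
```

              The resulting string can then be passed to the reshard command
              together with --path and --sharding.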

ADDITIONAL CEPH.CONF OPTIONS

       Any configuration option that is accepted by the OSD can also be passed
       to ceph-bluestore-tool. This is useful for providing necessary
       configuration options when access to the monitor or ceph.conf is
       impossible and the -i option cannot be used.

DEVICE LABELS

       Every BlueStore block device has a single block label at the beginning
       of the device. You can dump the contents of the label with:

          ceph-bluestore-tool show-label --dev *device*

       The main device will have a lot of metadata, including information that
       used to be stored in small files in the OSD data directory. The
       auxiliary devices (db and wal) will only have the minimum required
       fields (OSD UUID, size, device type, birth time).

OSD DIRECTORY PRIMING

       You can generate the content for an OSD data directory that can start
       up a BlueStore OSD with the prime-osd-dir command:

          ceph-bluestore-tool prime-osd-dir --dev *main device* --path /var/lib/ceph/osd/ceph-*id*

BLUEFS LOG RESCUE

       Some versions of BlueStore were susceptible to the BlueFS log growing
       extremely large, to the point of making it impossible to boot the OSD.
       This state is indicated by a boot that takes very long and fails in the
       _replay function.

       This can be fixed by:
              ceph-bluestore-tool fsck --path osd path --bluefs_replay_recovery=true

       It is advised to first check whether the rescue process would be
       successful:
              ceph-bluestore-tool fsck --path osd path --bluefs_replay_recovery=true --bluefs_replay_recovery_disable_compact=true

       If the above fsck is successful, the fix procedure can be applied.
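
       The advised order (trial run first, then the fix) can be sketched as a
       small wrapper. The function name rescue_bluefs_log and the
       BLUESTORE_TOOL variable are hypothetical names introduced for this
       example. The sketch also assumes that fsck signals failure through a
       non-zero exit status.

```shell
#!/bin/sh
# BLUESTORE_TOOL is a hypothetical override so the sketch can be tried
# against a stub; by default it points at the real binary.
: "${BLUESTORE_TOOL:=ceph-bluestore-tool}"

# Hypothetical wrapper: run the trial fsck first and apply the fix only if
# the trial succeeds. Assumes fsck exits non-zero on failure.
rescue_bluefs_log() {
    osd_path=$1
    if [ -z "$osd_path" ]; then
        echo "usage: rescue_bluefs_log <osd path>" >&2
        return 2
    fi
    # Step 1: check whether the rescue would succeed (compaction disabled).
    if ! $BLUESTORE_TOOL fsck --path "$osd_path" \
            --bluefs_replay_recovery=true \
            --bluefs_replay_recovery_disable_compact=true; then
        echo "trial fsck failed; fix not applied" >&2
        return 1
    fi
    # Step 2: apply the fix.
    $BLUESTORE_TOOL fsck --path "$osd_path" --bluefs_replay_recovery=true
}
```

       A call such as rescue_bluefs_log followed by the osd path would then
       run both steps in sequence and stop after the first if the trial run
       fails.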

AVAILABILITY

       ceph-bluestore-tool is part of Ceph, a massively scalable, open-source,
       distributed storage system. Please refer to the Ceph documentation at
       https://docs.ceph.com for more information.

SEE ALSO

       ceph-osd(8)

COPYRIGHT

       2010-2023, Inktank Storage, Inc. and contributors. Licensed under
       Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0)

dev                              Nov 15, 2023           CEPH-BLUESTORE-TOOL(8)