CEPH-BLUESTORE-TOOL(8)                      Ceph                      CEPH-BLUESTORE-TOOL(8)

NAME
       ceph-bluestore-tool - bluestore administrative tool

SYNOPSIS
       ceph-bluestore-tool command
       [ --dev device ... ]
       [ --path osd path ]
       [ --out-dir dir ]
       [ --log-file | -l filename ]
       [ --deep ]
       ceph-bluestore-tool fsck|repair --path osd path [ --deep ]
       ceph-bluestore-tool show-label --dev device ...
       ceph-bluestore-tool prime-osd-dir --dev device --path osd path
       ceph-bluestore-tool bluefs-export --path osd path --out-dir dir
       ceph-bluestore-tool bluefs-bdev-new-wal --path osd path --dev-target new-device
       ceph-bluestore-tool bluefs-bdev-new-db --path osd path --dev-target new-device
       ceph-bluestore-tool bluefs-bdev-migrate --path osd path --dev-target new-device --devs-source device1 [--devs-source device2]
       ceph-bluestore-tool free-dump|free-score --path osd path [ --allocator block/bluefs-wal/bluefs-db/bluefs-slow ]
       ceph-bluestore-tool reshard --path osd path --sharding new sharding [ --sharding-ctrl control string ]
       ceph-bluestore-tool show-sharding --path osd path


DESCRIPTION
       ceph-bluestore-tool is a utility to perform low-level administrative
       operations on a BlueStore instance.

COMMANDS
       help
              show help

       fsck [ --deep ]
              Run a consistency check on BlueStore metadata. If --deep is
              specified, also read all object data and verify checksums.

       repair
              Run a consistency check and repair any errors that can be
              fixed.

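       For example, to check and, if needed, repair a stopped OSD (the path
       is illustrative; substitute your own OSD id):

          ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0 --deep
          ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0
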
       bluefs-export
              Export the contents of BlueFS (i.e., RocksDB files) to an
              output directory.

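       For example, to dump the RocksDB files of a stopped OSD into a scratch
       directory for offline inspection (both paths are illustrative):

          ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-0 --out-dir /tmp/osd-0-bluefs
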
       bluefs-bdev-sizes --path osd path
              Print the device sizes, as understood by BlueFS, to stdout.

       bluefs-bdev-expand --path osd path
              Instruct BlueFS to check the size of its block devices and, if
              they have expanded, make use of the additional space. Note that
              only new files created by BlueFS will be allocated on the
              preferred block device if it has enough free space; existing
              files that have spilled over to the slow device will be
              gradually removed as RocksDB performs compaction. In other
              words, any data that has spilled over to the slow device will
              be moved to the fast device over time.

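       For example, after enlarging the underlying device (say, by growing an
       LVM logical volume; the OSD path is illustrative), one would typically
       verify the sizes BlueFS sees and then expand:

          ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-0
          ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0
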
       bluefs-bdev-new-wal --path osd path --dev-target new-device
              Adds a WAL device to BlueFS; fails if a WAL device already
              exists.

       bluefs-bdev-new-db --path osd path --dev-target new-device
              Adds a DB device to BlueFS; fails if a DB device already
              exists.

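       For example, to attach a dedicated DB device to an existing OSD that
       has none (device and OSD paths are illustrative):

          ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-0 --dev-target /dev/nvme0n1p1
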
       bluefs-bdev-migrate --dev-target new-device --devs-source device1
       [--devs-source device2]
              Moves BlueFS data from the source device(s) to the target
              device; source devices (except the main one) are removed on
              success. The target device can be either an already-attached
              device or a new one; in the latter case it is added to the OSD,
              replacing one of the source devices. The following replacement
              rules apply (in order of precedence, stopping at the first
              match); see the example below:

              • if the source list has a DB volume - the target device
                replaces it.

              • if the source list has a WAL volume - the target device
                replaces it.

              • if the source list has only the slow volume - the operation
                is not permitted and requires explicit allocation via the
                new-db/new-wal commands.

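       For example, the following sketch (all paths are illustrative) moves
       the DB data of a stopped OSD from its current DB volume to a new
       device, which then replaces it as the DB volume:

          ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-0 \
              --devs-source /var/lib/ceph/osd/ceph-0/block.db --dev-target /dev/nvme1n1p1
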
       show-label --dev device [...]
              Show device label(s).

       free-dump --path osd path [ --allocator block/bluefs-wal/bluefs-db/bluefs-slow ]
              Dump all free regions in the allocator.

       free-score --path osd path [ --allocator block/bluefs-wal/bluefs-db/bluefs-slow ]
              Give a [0-1] number that represents the quality of
              fragmentation in the allocator. 0 represents the case when all
              free space is in one chunk; 1 represents the worst possible
              fragmentation.

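       For example, to gauge fragmentation of the main data allocator of a
       stopped OSD (the path is illustrative):

          ceph-bluestore-tool free-score --path /var/lib/ceph/osd/ceph-0 --allocator block
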
       reshard --path osd path --sharding new sharding [ --resharding-ctrl control string ]
              Changes the sharding of BlueStore's RocksDB. Sharding is built
              on top of RocksDB column families. This option allows testing
              the performance of a new sharding scheme without the need to
              redeploy the OSD. Resharding is usually a long process that
              involves walking through the entire RocksDB key space and
              moving some keys to different column families. The
              --resharding-ctrl option provides performance control over the
              resharding process. Interrupted resharding will prevent the OSD
              from running, but it does not corrupt data; it is always
              possible to continue the previous resharding, or to select any
              other sharding scheme, including reverting to the original one.

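       For example, a run on a stopped OSD might first inspect the current
       sharding and then apply a new scheme (the OSD path and the sharding
       specification below are purely illustrative):

          ceph-bluestore-tool show-sharding --path /var/lib/ceph/osd/ceph-0
          ceph-bluestore-tool reshard --path /var/lib/ceph/osd/ceph-0 \
              --sharding "m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P"
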
       show-sharding --path osd path
              Show the sharding that is currently applied to BlueStore's
              RocksDB.

OPTIONS
       --dev *device*
              Add device to the list of devices to consider.

       --devs-source *device*
              Add device to the list of devices to consider as sources for
              the migrate operation.

       --dev-target *device*
              Specify the target device for the migrate operation, or the
              device to add when creating a new DB/WAL.

       --path *osd path*
              Specify an osd path. In most cases, the device list is inferred
              from the symlinks present in osd path. This is usually simpler
              than explicitly specifying the device(s) with --dev.

       --out-dir *dir*
              Output directory for bluefs-export.

       -l, --log-file *log file*
              File to log to.

       --log-level *num*
              Debug log level. Default is 30 (extremely verbose), 20 is very
              verbose, 10 is verbose, and 1 is not very verbose.

       --deep deep scrub/repair (read and validate object data, not just
              metadata)

       --allocator *name*
              Useful for free-dump and free-score actions. Selects the
              allocator(s).

       --resharding-ctrl *control string*
              Provides control over the resharding process. Specifies how
              often to refresh the RocksDB iterator, and how large the commit
              batch should be before committing to RocksDB. The option format
              is:
              <iterator_refresh_bytes>/<iterator_refresh_keys>/<batch_commit_bytes>/<batch_commit_keys>
              Default: 10000000/10000/1000000/1000

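       For example, reading the default 10000000/10000/1000000/1000 against
       the format above: the RocksDB iterator is refreshed every 10000000
       bytes or 10000 keys, and each batch is committed once it reaches
       1000000 bytes or 1000 keys. Doubling every field (for instance,
       20000000/20000/2000000/2000) refreshes the iterator half as often and
       commits larger batches.
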
DEVICE LABELS
       Every BlueStore block device has a single block label at the beginning
       of the device. You can dump the contents of the label with:

          ceph-bluestore-tool show-label --dev *device*

       The main device will have a lot of metadata, including information
       that used to be stored in small files in the OSD data directory. The
       auxiliary devices (db and wal) will only have the minimum required
       fields (OSD UUID, size, device type, birth time).

OSD DIRECTORY PRIMING
       You can generate the content for an OSD data directory that can start
       up a BlueStore OSD with the prime-osd-dir command:

          ceph-bluestore-tool prime-osd-dir --dev *main device* --path /var/lib/ceph/osd/ceph-*id*

BLUEFS LOG RESCUE
       Some versions of BlueStore were susceptible to the BlueFS log growing
       extremely large - beyond the point of making the OSD impossible to
       boot. This state is indicated by booting that takes very long and
       fails in the _replay function.

       This can be fixed by:

          ceph-bluestore-tool fsck --path osd path --bluefs_replay_recovery=true

       It is advised to first check whether the rescue process will be
       successful:

          ceph-bluestore-tool fsck --path osd path --bluefs_replay_recovery=true --bluefs_replay_recovery_disable_compact=true

       If the above fsck is successful, the fix procedure can be applied.


AVAILABILITY
       ceph-bluestore-tool is part of Ceph, a massively scalable,
       open-source, distributed storage system. Please refer to the Ceph
       documentation at http://ceph.com/docs for more information.

SEE ALSO
       ceph-osd(8)

COPYRIGHT
       2010-2021, Inktank Storage, Inc. and contributors. Licensed under
       Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0)




dev                                Sep 28, 2021               CEPH-BLUESTORE-TOOL(8)