OSDMAPTOOL(8)                        Ceph                        OSDMAPTOOL(8)

NAME

       osdmaptool - ceph osd cluster map manipulation tool

SYNOPSIS

       osdmaptool mapfilename [--print] [--createsimple numosd
       [--pgbits bitsperosd]] [--clobber]
       osdmaptool mapfilename [--import-crush crushmap]
       osdmaptool mapfilename [--export-crush crushmap]
       osdmaptool mapfilename [--upmap file] [--upmap-max max-optimizations]
       [--upmap-deviation max-deviation] [--upmap-pool poolname]
       [--save] [--upmap-active]
       osdmaptool mapfilename [--upmap-cleanup] [--upmap file]

DESCRIPTION

       osdmaptool is a utility that lets you create, view, and manipulate OSD
       cluster maps from the Ceph distributed storage system. Notably, it
       lets you extract the embedded CRUSH map or import a new CRUSH map.
       It can also simulate the upmap balancer mode so you can get a sense
       of what is needed to balance your PGs.

OPTIONS

       --print
              will simply make the tool print a plaintext dump of the map,
              after any modifications are made.

       --dump <format>
              displays the map in plain text when <format> is 'plain', or as
              'json' if the specified format is not supported.  This is an
              alternative to the --print option.

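       For example, to dump the map in JSON format:

          osdmaptool osdmap --dump json
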
       --clobber
              will allow osdmaptool to overwrite mapfilename if changes are
              made.

       --import-crush mapfile
              will load the CRUSH map from mapfile and embed it in the OSD
              map.

       --export-crush mapfile
              will extract the CRUSH map from the OSD map and write it to
              mapfile.

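       For example, a round trip that extracts the embedded CRUSH map, edits
       it, and embeds it again might look like this (file names are
       illustrative):

          osdmaptool osdmap --export-crush crush.bin
          crushtool -d crush.bin -o crush.txt
          # edit crush.txt as needed, then recompile and re-import it
          crushtool -c crush.txt -o crush.new
          osdmaptool osdmap --import-crush crush.new --clobber
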
       --createsimple numosd [--pg-bits bitsperosd] [--pgp-bits bits]
              will create a relatively generic OSD map with the numosd
              devices.  If --pg-bits is specified, the initial placement
              group counts will be set with bitsperosd bits per OSD. That
              is, the pg_num map attribute will be set to numosd shifted by
              bitsperosd.  If --pgp-bits is specified, then the pgp_num map
              attribute will be set to numosd shifted by bits.

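       As a sketch of the arithmetic (assuming the --pg-bits spelling shown
       above, and creating a default pool so there is a pg_num to inspect),
       16 OSDs with 6 PG bits per OSD gives pg_num = 16 << 6 = 1024:

          osdmaptool --createsimple 16 --pg-bits 6 --with-default-pool osdmap --clobber
          osdmaptool --print osdmap | grep pg_num
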
       --create-from-conf
              creates an osd map with default configurations.

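       A minimal sketch, assuming a ceph.conf that describes the OSDs is
       available in a default location (or is passed with -c):

          osdmaptool osdmap --create-from-conf --with-default-pool --clobber
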
       --test-map-pgs [--pool poolid] [--range-first <first> --range-last
       <last>]
              will print out the mappings from placement groups to OSDs.  If
              a range is specified, then it iterates from first to last in
              the directory specified by the argument to osdmaptool.  E.g.:
              osdmaptool --test-map-pgs --range-first 0 --range-last 2
              osdmap_dir.  This will iterate through the files named 0,1,2
              in osdmap_dir.

       --test-map-pgs-dump [--pool poolid] [--range-first <first>
       --range-last <last>]
              will print out the summary of all placement groups and the
              mappings from them to the mapped OSDs.  If a range is
              specified, then it iterates from first to last in the
              directory specified by the argument to osdmaptool.  E.g.:
              osdmaptool --test-map-pgs-dump --range-first 0 --range-last 2
              osdmap_dir.  This will iterate through the files named 0,1,2
              in osdmap_dir.

       --test-map-pgs-dump-all [--pool poolid] [--range-first <first>
       --range-last <last>]
              will print out the summary of all placement groups and the
              mappings from them to all the OSDs.  If a range is specified,
              then it iterates from first to last in the directory specified
              by the argument to osdmaptool.  E.g.: osdmaptool
              --test-map-pgs-dump-all --range-first 0 --range-last 2
              osdmap_dir.  This will iterate through the files named 0,1,2
              in osdmap_dir.

       --test-random
              does a random mapping of placement groups to the OSDs.

       --test-map-pg <pgid>
              maps a particular placement group (specified by pgid) to the
              OSDs.

       --test-map-object <objectname> [--pool <poolid>]
              maps a particular object (specified by objectname) to the
              OSDs.

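       For example, to check where a single placement group or a single
       object would be mapped (the pgid and object name are illustrative):

          osdmaptool osdmap --test-map-pg 1.7
          osdmaptool osdmap --test-map-object myobject --pool 1
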
       --test-crush [--range-first <first> --range-last <last>]
              maps placement groups to acting OSDs.  If a range is
              specified, then it iterates from first to last in the
              directory specified by the argument to osdmaptool.  E.g.:
              osdmaptool --test-crush --range-first 0 --range-last 2
              osdmap_dir.  This will iterate through the files named 0,1,2
              in osdmap_dir.

       --mark-up-in
              mark osds up and in (but do not persist).

       --mark-out
              mark an osd as out (but do not persist)

       --mark-up <osdid>
              mark an osd as up (but do not persist)

       --mark-in <osdid>
              mark an osd as in (but do not persist)

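       Because these marks are not persisted, they are typically combined
       with one of the --test-* options in the same invocation, e.g.:

          osdmaptool osdmap --mark-up-in --test-map-pgs --pool 1
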
       --tree Displays a hierarchical tree of the map.

       --clear-temp
              clears pg_temp and primary_temp variables.

       --clean-temps
              clean pg_temps.

       --health
              dump health checks

       --with-default-pool
              include default pool when creating map

       --upmap-cleanup <file>
              clean up pg_upmap[_items] entries, writing commands to <file>
              [default: - for stdout]

       --upmap <file>
              calculate pg upmap entries to balance pg layout, writing
              commands to <file> [default: - for stdout]

       --upmap-max <max-optimizations>
              set max upmap entries to calculate [default: 10]

       --upmap-deviation <max-deviation>
              max deviation from target [default: 5]

       --upmap-pool <poolname>
              restrict upmap balancing to a single pool; the option can be
              repeated for multiple pools

       --upmap-active
              Act like an active balancer, keep applying changes until
              balanced

       --adjust-crush-weight <osdid:weight>[,<osdid:weight>,<...>]
              Change CRUSH weight of <osdid>

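       A minimal sketch that adjusts two CRUSH weights (the ids and weights
       are illustrative), saves the modified map, and then inspects the
       resulting distribution:

          osdmaptool osdmap --adjust-crush-weight 0:1.2,3:0.8 --save
          osdmaptool osdmap --test-map-pgs-dump --pool 1
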
       --save write modified osdmap with upmap or crush-adjust changes

       --read <file>
              calculate pg upmap entries to balance pg primaries

       --read-pool <poolname>
              specify which pool the read balancer should adjust

       --vstart
              prefix upmap and read output with './bin/'

EXAMPLE

       To create a simple map with 16 devices:

          osdmaptool --createsimple 16 osdmap --clobber

       To view the result:

          osdmaptool --print osdmap

       To view the mappings of placement groups for pool 1:

          osdmaptool osdmap --test-map-pgs-dump --pool 1

          pool 1 pg_num 8
          1.0     [0,2,1] 0
          1.1     [2,0,1] 2
          1.2     [0,1,2] 0
          1.3     [2,0,1] 2
          1.4     [0,2,1] 0
          1.5     [0,2,1] 0
          1.6     [0,1,2] 0
          1.7     [1,0,2] 1
          #osd    count   first   primary c wt    wt
          osd.0   8       5       5       1       1
          osd.1   8       1       1       1       1
          osd.2   8       2       2       1       1
           in 3
           avg 8 stddev 0 (0x) (expected 2.3094 0.288675x))
           min osd.0 8
           max osd.0 8
          size 0  0
          size 1  0
          size 2  0
          size 3  8

       In this output:

              1. pool 1 has 8 placement groups, and two tables follow:

              2. A table for placement groups. Each row represents a
                 placement group, with columns of:

                 • placement group id,

                 • acting set, and

                 • primary OSD.

              3. A table for all OSDs. Each row represents an OSD, with
                 columns of:

                 • count of placement groups being mapped to this OSD,

                 • count of placement groups where this OSD is the first one
                   in their acting sets,

                 • count of placement groups for which this OSD is the
                   primary,

                 • the CRUSH weight of this OSD, and

                 • the weight of this OSD.

              4. Looking at the number of placement groups held by the 3
                 OSDs, we have (see the worked calculation after this list):

                 • average, stddev, stddev/average, expected stddev,
                   expected stddev / average

                 • min and max

              5. The number of placement groups mapping to n OSDs.  In this
                 case, all 8 placement groups are mapped to 3 different
                 OSDs.

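       As a sketch of where the "expected" figures appear to come from: with
       8 placement groups of size 3 there are 24 placements spread over 3
       OSDs, and the binomial standard deviation of that spread matches the
       reported value:

          total placements = 8 PGs x 3 replicas = 24
          expected stddev  = sqrt(24 * (1/3) * (2/3)) ~= 2.3094
          expected stddev / average = 2.3094 / 8 ~= 0.288675
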
       In a less-balanced cluster, we could have the following output for
       the statistics of placement group distribution, whose standard
       deviation is 1.41421:

          #osd    count   first    primary c wt    wt
          osd.0   33      9        9       0.0145874     1
          osd.1   34      14       14      0.0145874     1
          osd.2   31      7        7       0.0145874     1
          osd.3   31      13       13      0.0145874     1
          osd.4   30      14       14      0.0145874     1
          osd.5   33      7        7       0.0145874     1
           in 6
           avg 32 stddev 1.41421 (0.0441942x) (expected 5.16398 0.161374x))
           min osd.4 30
           max osd.1 34
          size 0  0
          size 1  0
          size 2  0
          size 3  64

       To simulate the active balancer in upmap mode:

          osdmaptool --upmap upmaps.out --upmap-active --upmap-deviation 6 --upmap-max 11 osdmap

          osdmaptool: osdmap file 'osdmap'
          writing upmap command output to: upmaps.out
          checking for upmap cleanups
          upmap, max-count 11, max deviation 6
          pools movies photos metadata data
          prepared 11/11 changes
          Time elapsed 0.00310404 secs
          pools movies photos metadata data
          prepared 11/11 changes
          Time elapsed 0.00283402 secs
          pools data metadata movies photos
          prepared 11/11 changes
          Time elapsed 0.003122 secs
          pools photos metadata data movies
          prepared 11/11 changes
          Time elapsed 0.00324372 secs
          pools movies metadata data photos
          prepared 1/11 changes
          Time elapsed 0.00222609 secs
          pools data movies photos metadata
          prepared 0/11 changes
          Time elapsed 0.00209916 secs
          Unable to find further optimization, or distribution is already perfect
          osd.0 pgs 41
          osd.1 pgs 42
          osd.2 pgs 42
          osd.3 pgs 41
          osd.4 pgs 46
          osd.5 pgs 39
          osd.6 pgs 39
          osd.7 pgs 43
          osd.8 pgs 41
          osd.9 pgs 46
          osd.10 pgs 46
          osd.11 pgs 46
          osd.12 pgs 46
          osd.13 pgs 41
          osd.14 pgs 40
          osd.15 pgs 40
          osd.16 pgs 39
          osd.17 pgs 46
          osd.18 pgs 46
          osd.19 pgs 39
          osd.20 pgs 42
          Total time elapsed 0.0167765 secs, 5 rounds

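       The generated file contains ordinary ceph CLI commands; if the
       proposed changes look reasonable, they can be applied to a live
       cluster by running them, e.g.:

          source upmaps.out
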
       To simulate the active balancer in read mode, first make sure
       capacity is balanced by running the balancer in upmap mode.  Then,
       balance the reads on a replicated pool with:

          osdmaptool osdmap --read read.out --read-pool <pool name>

          ./bin/osdmaptool: osdmap file 'om'
          writing upmap command output to: read.out

          ---------- BEFORE ------------
          osd.0 | primary affinity: 1 | number of prims: 3
          osd.1 | primary affinity: 1 | number of prims: 10
          osd.2 | primary affinity: 1 | number of prims: 3

          read_balance_score of 'cephfs.a.meta': 1.88


          ---------- AFTER ------------
          osd.0 | primary affinity: 1 | number of prims: 5
          osd.1 | primary affinity: 1 | number of prims: 5
          osd.2 | primary affinity: 1 | number of prims: 6

          read_balance_score of 'cephfs.a.meta': 1.13


          num changes: 5

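       As with the upmap output above, the commands written to read.out can
       be applied to a live cluster in the same way, e.g.:

          source read.out
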

AVAILABILITY

       osdmaptool is part of Ceph, a massively scalable, open-source,
       distributed storage system.  Please refer to the Ceph documentation
       at https://docs.ceph.com for more information.

SEE ALSO

       ceph(8), crushtool(8)

COPYRIGHT

       2010-2023, Inktank Storage, Inc. and contributors. Licensed under
       Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0)

dev                              Nov 15, 2023                    OSDMAPTOOL(8)