HWLOC-CALC(1)                        hwloc                       HWLOC-CALC(1)

NAME
       hwloc-calc - Operate on cpu mask strings and objects

SYNOPSIS
       hwloc-calc [topology options] [options] <location1> [<location2> [...] ]

       Note that hwloc(7) provides a detailed explanation of the hwloc system
       and of valid <location> formats; it should be read before reading this
       man page.

TOPOLOGY OPTIONS
       All topology options must be given before all other options.

       --no-smt, --no-smt=<N>
              Only keep the first PU per core in the input locations. If
              <N> is specified, keep the <N>-th instead, if any. PUs are
              ordered by physical index during this filtering.

       --cpukind <n>
       --cpukind <infoname>=<infovalue>
              Only keep PUs whose CPU kind matches. Either a single CPU
              kind is specified by its index, or the info name/value pair
              selects all matching kinds.

       --restrict <cpuset>
              Restrict the topology to the given cpuset.

       --restrict nodeset=<nodeset>
              Restrict the topology to the given nodeset, unless
              --restrict-flags specifies something different.

       --restrict-flags <flags>
              Enforce flags when restricting the topology. Flags may be
              given as numeric values or as a comma-separated list of flag
              names that are passed to hwloc_topology_restrict(). Those
              names may be substrings of actual flag names as long as a
              single one matches, for instance bynodeset,memless. The
              default is 0 (or none).

       --disallowed
              Include objects disallowed by administrative limitations.

       -i <file>, --input <file>
              Read topology from XML file <file> (instead of discovering
              the topology on the local machine). If <file> is "-", the
              standard input is used. XML support must have been compiled
              in to hwloc for this option to be usable.

       -i <directory>, --input <directory>
              Read topology from <directory> instead of discovering the
              topology of the local machine. On Linux, the directory may
              contain the topology files gathered from another machine
              with hwloc-gather-topology. On x86, the directory may
              contain a cpuid dump gathered with hwloc-gather-cpuid.

       -i <specification>, --input <specification>
              Simulate a fake hierarchy (instead of discovering the
              topology on the local machine). If <specification> is
              "node:2 pu:3", the topology will contain two NUMA nodes with
              3 processing units in each of them. The <specification>
              string must end with a number of PUs.

       --if <format>, --input-format <format>
              Enforce the input in the given format, among xml, fsroot,
              cpuid and synthetic.
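
       The synthetic form of -i makes it possible to try hwloc-calc on a
       topology that does not depend on the local machine. The following is
       a minimal sketch, assuming a POSIX shell and hwloc-calc in the PATH.
       The helper name synth_calc is hypothetical and not part of hwloc.

```shell
# synth_calc: hypothetical wrapper (not part of hwloc) that runs hwloc-calc
# on a synthetic topology. The -i topology option is passed first because
# topology options must be given before all other options.
synth_calc() {
    topo=$1
    shift
    hwloc-calc -i "$topo" "$@"
}

# Usage (not run here):
#   synth_calc "node:2 pu:3" --number-of pu all
#   synth_calc "node:2 pu:3" --intersect pu node:1
```

       With the "node:2 pu:3" specification, the first call should count
       the PUs of the fake machine and the second should list the logical
       indexes of the PUs attached to the second NUMA node.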
70
72 All these options must be given after all topology options above.
73
74 -p --physical
75 Use OS/physical indexes instead of logical indexes for both
76 input and output.
77
78 -l --logical
79 Use logical indexes instead of physical/OS indexes for both
80 input and output (default).
81
82 --pi --physical-input
83 Use OS/physical indexes instead of logical indexes for input.
84
85 --li --logical-input
86 Use logical indexes instead of physical/OS indexes for input
87 (default).
88
89 --po --physical-output
90 Use OS/physical indexes instead of logical indexes for out‐
91 put.
92
93 --lo --logical-output
94 Use logical indexes instead of physical/OS indexes for output
95 (default, except for cpusets which are always physical).
96
97 -n --nodeset
98 Interpret both input and output sets as nodesets instead of
99 CPU sets.
100
101 --no --nodeset-output
102 Report nodesets instead of CPU sets.
103
104 --ni --nodeset-input
105 Interpret input sets as nodesets instead of CPU sets.

       -N --number-of <type|depth>
              Report the number of objects of the given type or depth that
              intersect the CPU set. This is convenient for finding how
              many cores, NUMA nodes or PUs are available in a machine.

              When combined with --nodeset or --nodeset-output, the nodeset
              is considered instead of the CPU set for finding matching
              objects. This is useful when reporting the output as a number
              or set of NUMA nodes.

       -I --intersect <type|depth>
              Find the list of objects of the given type or depth that
              intersect the CPU set and report the comma-separated list of
              their indexes instead of the CPU mask string. This may be
              used for determining the list of objects above or below the
              input objects.

              When combined with --physical, the list is convenient to pass
              to external tools such as taskset or numactl --physcpubind or
              --membind. This is different from --largest since the latter
              requires that all reported objects are strictly included
              inside the input objects.

              When combined with --nodeset or --nodeset-output, the nodeset
              is considered instead of the CPU set for finding matching
              objects. This is useful when reporting the output as a number
              or set of NUMA nodes.

       -H --hierarchical <type1>.<type2>...
              Find the list of objects of type <type2> that intersect the
              CPU set and report the space-separated list of their
              hierarchical indexes with respect to <type1>, <type2>, etc.
              For instance, if package.core is given, the output would be
              Package:1.Core:2 Package:2.Core:3 if the input contains the
              third core of the second package and the fourth core of the
              third package.

              Only normal CPU-side object types may be used. NUMA nodes
              cannot.

       --largest
              Report (in a human readable format) the list of largest
              objects which exactly include all input objects (by looking
              at their CPU sets). None of these output objects intersect
              each other, and the sum of them is exactly equivalent to the
              input. No larger object is included in the input. This is
              different from --intersect where reported objects may not be
              strictly included in the input.

       --local-memory
              Report the list of NUMA nodes that are local to the input
              objects.

              This option is similar to -I numa but the way nodes are
              selected is different: the selection performed by
              --local-memory may be precisely configured with
              --local-memory-flags, while -I numa just selects all nodes
              that are somehow local to any of the input objects.

       --local-memory-flags <flags>
              Change the flags used to select local NUMA nodes. Flags may
              be given as numeric values or as a comma-separated list of
              flag names that are passed to
              hwloc_get_local_numanode_objs(). Those names may be
              substrings of actual flag names as long as a single one
              matches. The default is 3 (or smaller,larger) which means
              NUMA nodes are displayed if their locality either contains or
              is contained in the locality of the given object.

              This option enables --local-memory.

       --best-memattr <name>
              Enable the listing of local memory nodes with --local-memory,
              but only display the local node that has the best value for
              the memory attribute given by <name> (or by its index).

              If the memory attribute values depend on the initiator, the
              hwloc-calc input objects are used as the initiator.

              Standard attribute names are Capacity, Locality, Bandwidth,
              and Latency. All existing attributes in the current topology
              may be listed with

                  $ lstopo --memattrs

       --sep <sep>
              Change the field separator in the output. By default, a
              space is used to separate output objects (for instance when
              --hierarchical or --largest is given) while a comma is used
              to separate indexes (for instance when --intersect is given).

       --single
              Singlify the output to a single CPU.

       --taskset
              Display CPU set strings in the format recognized by the
              taskset command-line program instead of the hwloc-specific
              CPU set string format. This option has no impact on the
              format of input CPU set strings; both formats are always
              accepted.

       -q --quiet
              Hide non-fatal error messages. These mostly concern
              locations pointing to non-existing objects.

       -v --verbose
              Verbose output.

       --version
              Report version and exit.

       -h --help
              Display help message and exit.

DESCRIPTION
       hwloc-calc generates and manipulates CPU mask strings or objects. Both
       input and output may be either objects (with physical or logical
       indexes), CPU lists (with physical or logical indexes), or CPU mask
       strings (always physically indexed). Input location specification is
       described in hwloc(7).
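
       Because CPU mask strings are always physically indexed, each set bit
       stands for one physical PU index. The following is a minimal sketch
       of that reading, assuming a POSIX shell and a mask that fits in a
       single hexadecimal word. The function name mask_to_indexes is
       hypothetical and not part of hwloc.

```shell
# mask_to_indexes: hypothetical helper (not part of hwloc) that prints the
# comma-separated indexes of the bits set in a single-word hexadecimal mask.
mask_to_indexes() {
    mask=$(($1))   # POSIX arithmetic accepts 0x-prefixed constants
    idx=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            # Append the index, inserting a comma after the first entry.
            out="${out:+$out,}$idx"
        fi
        mask=$((mask >> 1))
        idx=$((idx + 1))
    done
    printf '%s\n' "$out"
}
```

       For instance, mask_to_indexes 0x000000f0 prints 4,5,6,7. Wider masks
       are not handled by this sketch.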

       If objects or CPU mask strings are given on the command-line, they are
       combined and a single output is printed. If no object or CPU mask
       string is given on the command-line, the program reads the standard
       input. It combines multiple objects or CPU mask strings that are given
       on the same input line with spaces as separators. Different input
       lines are processed separately.
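
       The standard-input mode can be sketched as follows, assuming a POSIX
       shell and hwloc-calc in the PATH. The helper name calc_each_line is
       hypothetical and not part of hwloc.

```shell
# calc_each_line: hypothetical helper (not part of hwloc) that sends each of
# its arguments as one input line to a single hwloc-calc process. No location
# is given on the command line, so hwloc-calc reads the standard input.
calc_each_line() {
    printf '%s\n' "$@" | hwloc-calc
}

# Usage (not run here):
#   calc_each_line "package:0 package:1" "core:0"
```

       In the usage line, the two packages share an input line and should
       therefore be combined into one result, while core:0 sits on its own
       line and should be reported separately.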

       Command-line arguments and options are processed in order. Topology
       configuration options should be given first. Then, for instance,
       changing the type of input indexes with --li or changing the input
       topology with -i only affects the processing of the following
       arguments.

       NOTE: It is highly recommended that you read the hwloc(7) overview page
       before reading this man page. Most of the concepts described in
       hwloc(7) directly apply to the hwloc-calc utility.

EXAMPLES
       hwloc-calc's operation is best described through several examples.

       To display the (physical) CPU mask corresponding to the second package:

           $ hwloc-calc package:1
           0x000000f0

       To display the (physical) CPU mask corresponding to the third package,
       excluding its even numbered logical processors:

           $ hwloc-calc package:2 ~PU:even
           0x00000c00

       To convert a CPU mask to human-readable output, the -H option can be
       used to emit a space-delimited list of locations:

           $ echo 0x000000f0 | hwloc-calc -H package.core
           Package:1.Core:0 Package:1.Core:1 Package:1.Core:2 Package:1.Core:3

       To use some other character (e.g., a comma) instead of spaces in
       output, use the --sep option:

           $ echo 0x000000f0 | hwloc-calc -H package.core --sep ,
           Package:1.Core:0,Package:1.Core:1,Package:1.Core:2,Package:1.Core:3

       To combine two (physical) CPU masks:

           $ hwloc-calc 0x0000ffff 0xff000000
           0xff00ffff

       To display the list of logical numbers of processors included in the
       second package:

           $ hwloc-calc --intersect PU package:1
           4,5,6,7

       To bind GNU OpenMP threads logically over the whole machine, we need to
       use physical number output instead:

           $ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU all`
           $ echo $GOMP_CPU_AFFINITY
           0,4,1,5,2,6,3,7
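
       The same kind of physical index list can be handed to taskset -c. The
       following is a minimal sketch, assuming a POSIX shell, hwloc-calc and
       the Linux taskset program in the PATH. The helper name run_on_pus is
       hypothetical and not part of hwloc.

```shell
# run_on_pus: hypothetical helper (not part of hwloc) that runs a command on
# the PUs of a location given with logical indexes.
# --physical-output makes hwloc-calc print OS indexes, which taskset -c
# expects, while the location itself is still parsed with logical indexes.
run_on_pus() {
    loc=$1
    shift
    cpus=$(hwloc-calc --physical-output --intersect PU "$loc") || return 1
    # Refuse to call taskset with an empty CPU list.
    [ -n "$cpus" ] || return 1
    taskset -c "$cpus" "$@"
}

# Usage (not run here):
#   run_on_pus package:1 ./my_app
```

       --physical-output is used rather than --physical so that only the
       output indexes change.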

       To display the list of NUMA nodes, by physical indexes, that intersect
       a given (physical) CPU mask:

           $ hwloc-calc --physical --intersect NUMAnode 0xf0f0f0f0
           0,2

       To display the list of NUMA nodes, by physical indexes, whose locality
       is exactly equal to a Package:

           $ hwloc-calc --local-memory-flags 0 pack:1
           4,7

       To display the best-capacity NUMA node, by physical index, whose
       locality is exactly equal to a Package:

           $ hwloc-calc --local-memory-flags 0 --best-memattr capacity pack:1
           4

       Converting object logical indexes (default) from/to physical/OS indexes
       may be performed with --intersect combined with either
       --physical-output (logical to physical conversion) or --physical-input
       (physical to logical):

           $ hwloc-calc --physical-output PU:2 --intersect PU
           3
           $ hwloc-calc --physical-input PU:3 --intersect PU
           2

       One should add --nodeset when converting indexes of memory objects to
       make sure a single NUMA node index is returned on platforms with
       heterogeneous memory:

           $ hwloc-calc --nodeset --physical-output node:2 --intersect node
           3
           $ hwloc-calc --nodeset --physical-input node:3 --intersect node
           2

       To display the set of CPUs near network interface eth0:

           $ hwloc-calc os=eth0
           0x00005555

       To display the indexes of packages near the PCI device whose bus ID is
       0000:01:02.0:

           $ hwloc-calc pci=0000:01:02.0 --intersect Package
           1

       To display the list of per-package cores that intersect the input:

           $ hwloc-calc 0x00003c00 --hierarchical package.core
           Package:2.Core:1 Package:3.Core:0

       To display the (physical) CPU mask of the entire topology except the
       third package:

           $ hwloc-calc all ~package:2
           0x0000f0ff

       To combine both physical and logical indexes as input:

           $ hwloc-calc PU:2 --physical-input PU:3
           0x0000000c

       To synthesize a set of cores into largest objects on a 2-node 2-package
       2-core machine:

           $ hwloc-calc core:0 --largest
           Core:0
           $ hwloc-calc core:0-1 --largest
           Package:0
           $ hwloc-calc core:4-7 --largest
           NUMANode:1
           $ hwloc-calc core:2-6 --largest
           Package:1 Package:2 Core:6
           $ hwloc-calc pack:2 --largest
           Package:2
           $ hwloc-calc package:2-3 --largest
           NUMANode:1

       To get the set of first threads of all cores:

           $ hwloc-calc core:all.pu:0
           $ hwloc-calc --no-smt all

       This can also be very useful in order to make GNU OpenMP use exactly
       one thread per core, and in logical core order:

           $ export OMP_NUM_THREADS=`hwloc-calc --number-of core all`
           $ echo $OMP_NUM_THREADS
           4
           $ export GOMP_CPU_AFFINITY=`hwloc-calc --physical-output --intersect PU --no-smt all`
           $ echo $GOMP_CPU_AFFINITY
           0,2,1,3

RETURN VALUE
       Upon successful execution, hwloc-calc displays the (physical) CPU mask
       string, (physical or logical) object list, or (physical or logical)
       object number list. The return value is 0.

       hwloc-calc will return nonzero if any kind of error occurs, such as
       (but not limited to) a failure to parse the command line.
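
       A script can rely on this return value before using the output. The
       following is a minimal sketch, assuming a POSIX shell and hwloc-calc
       in the PATH. The helper name safe_calc is hypothetical, and the
       additional check for empty output is a defensive choice of this
       sketch rather than documented hwloc-calc behavior.

```shell
# safe_calc: hypothetical wrapper (not part of hwloc) that prints the
# hwloc-calc result only when the command returned 0 and printed something.
safe_calc() {
    result=$(hwloc-calc "$@") || {
        echo "hwloc-calc failed for: $*" >&2
        return 1
    }
    # Defensive assumption: treat an empty result as unusable.
    [ -n "$result" ] || {
        echo "hwloc-calc printed nothing for: $*" >&2
        return 1
    }
    printf '%s\n' "$result"
}

# Usage (not run here):
#   mask=$(safe_calc package:1) || exit 1
```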

SEE ALSO
       hwloc(7), lstopo(1), hwloc-info(1)

2.4.1                            Feb 11, 2021                    HWLOC-CALC(1)