pahole(1) dwarves pahole(1)

NAME
pahole - Shows and manipulates data structure layouts and pretty prints
raw data.
8
SYNOPSIS
pahole [options] files
11
DESCRIPTION
pahole shows data structure layouts encoded in debugging information
formats; DWARF, CTF and BTF are supported.
15
This is useful for, among other things: optimizing important data
structures by reducing their size, figuring out which field sits at a
given offset from the start of a data structure, investigating ABI
changes and, more generally, understanding a new codebase you have to
work with.
21
It also uses these structure layouts to pretty print raw data, read from
a file or fed to its standard input, e.g.:
24
25 $ pahole --header elf64_hdr --prettify /lib/modules/5.8.0-rc6+/build/vmlinux
26 {
27 .e_ident = { 127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
28 .e_type = 2,
29 .e_machine = 62,
30 .e_version = 1,
31 .e_entry = 16777216,
32 .e_phoff = 64,
33 .e_shoff = 604653784,
34 .e_flags = 0,
35 .e_ehsize = 64,
36 .e_phentsize = 56,
37 .e_phnum = 5,
38 .e_shentsize = 64,
39 .e_shnum = 80,
40 .e_shstrndx = 79,
41 },
42 $
43
44 See the PRETTY PRINTING section for further examples and documentation.
45
46 The files must have associated debugging information. This information
47 may be inside the file itself, in ELF sections, or in another file.
48
One way to have this information is to specify the -g option to the
compiler when building the file. When this is done the information will
be stored in an ELF section. For the DWARF debugging information format
this adds, among others, the .debug_info ELF section. For CTF it is
found in just one ELF section, .SUNW_ctf. BTF comes in at least the
.BTF ELF section, and may also come with the .BTF.ext ELF section.
55
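To check which of these sections a file carries before pointing pahole
at it, a sketch along the following lines may help. It assumes GNU
readelf(1) is installed; the function name debug_formats is made up for
this illustration and is not part of pahole.

```shell
# Hypothetical helper (not part of pahole): print which debugging formats
# an ELF file appears to carry, judging only by its section names.
debug_formats() {
    # -S lists the section headers, -W keeps each one on a single line.
    sections=$(readelf -S -W "$1") || return 1
    for pair in dwarf:.debug_info ctf:.SUNW_ctf btf:.BTF; do
        format=${pair%%:*}
        section=${pair#*:}
        # Match the section name as a whole whitespace-delimited field, so
        # that ".BTF" does not also match ".BTF.ext".
        if printf '%s\n' "$sections" | awk -v s="$section" '
               { for (i = 1; i <= NF; i++) if ($i == s) found = 1 }
               END { exit !found }'; then
            echo "$format"
        fi
    done
}
```

For instance, "debug_formats vmlinux" prints one line per format found.
Debugging information kept in a separate debuginfo file is not detected
by this check.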
56 The debuginfo packages available in most Linux distributions are also
57 supported by pahole, where the debugging information is available in a
58 separate file.
59
60 By default, pahole shows the layout of all named structs in the files
61 specified.
62
If no files are specified, pahole checks whether
/sys/kernel/btf/vmlinux is present and, if so, uses the BTF information
in it about the running kernel, i.e. this works:
66
67 $ pahole list_head
68 struct list_head {
69 struct list_head * next; /* 0 8 */
70 struct list_head * prev; /* 8 8 */
71
72 /* size: 16, cachelines: 1, members: 2 */
73 /* last cacheline: 16 bytes */
74 };
75 $
76
77 If BTF is not present and no file is passed, then a vmlinux that
78 matches the build-id for the running kernel will be looked up in the
79 usual places, including where the kernel debuginfo packages put it,
80 looking for DWARF info instead.
81
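The lookup order described above can be approximated in a script that
wants to know in advance which format pahole is likely to use. This is
only a sketch of the documented behaviour: the function name is
hypothetical and the search pahole really performs for a matching
vmlinux is more involved.

```shell
# Hypothetical helper (not part of pahole): report which debugging format
# pahole would likely use when no file is given on the command line.
default_debug_source() {
    # The path can be overridden, mostly for testing.
    btf_path=${1:-/sys/kernel/btf/vmlinux}
    if [ -r "$btf_path" ]; then
        echo "btf"      # BTF for the running kernel is available
    else
        echo "dwarf"    # fall back to a vmlinux matching the build-id
    fi
}
```

Running "default_debug_source" with no arguments checks the standard
/sys/kernel/btf/vmlinux path.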
82 See the EXAMPLES section for more usage suggestions.
83
It also pretty prints whatever is fed to its standard input, according
to the type specified, see the PRETTY PRINTING section.
86
87 Use --count to state how many records should be pretty printed.
88
89
OPTIONS
pahole supports the following options.
92
93
94 -C, --class_name=CLASS_NAMES
95 Show just these classes. This can be a comma separated list of
96 class names or file URLs (e.g.: file://class_list.txt)
97
98
99 -c, --cacheline_size=SIZE
100 Set cacheline size to SIZE bytes.
101
102
--sort Sort the output by type name. This may grow to allow sorting by
other criteria.

This is mostly needed so that pretty printing from BTF and DWARF
can be compared when using multiple threads to load DWARF data, as
the order in which the types in the compile units are processed is
then not deterministic.
110
111
112 --compile
Generate compilable code, with all definitions for all types,
i.e.:

$ pahole --compile > vmlinux.h

Produces a header that can be included in a C source file and built. In
the example provided it will use the BTF info if available, otherwise
it will look for a DWARF file matching the running kernel build-id.
121
122
123 --skip_emitting_atomic_typedefs
Do not emit 'typedef _Atomic int atomic_int' & friends when used
with options like --compile. Use it if the compiler provides
these already. As of circa 2022, with gcc 12.2.1, those are not
encoded in DWARF, so to generate compilable code pahole needs to emit
those typedefs for the atomic types used in the data structures
being emitted from debugging information.
130
131
132 --count=COUNT
133 Pretty print the first COUNT records from input.
134
135
136 --skip=COUNT
137 Skip COUNT input records.
138
139
140 -E, --expand_types
Expand class members. Useful to find which member of an inner
struct sits at a given offset from the beginning of a struct.
143
144
145 -F, --format_path
146 Allows specifying a list of debugging formats to try, in order.
147 Right now this includes "btf", "ctf" and "dwarf". The default
148 format path used is equivalent to "-F dwarf,btf,ctf".
149
150
151 --hashbits=BITS
152 Allows specifying the number of bits for the debugging format
153 loader to use. The only one affected so far is the "dwarf" one,
154 its default now is 15, the maximum for it is now 21 bits. Tweak
155 it to see if it improves performance as the kernel evolves and
156 more types and functions have to be loaded.
157
158
159 --hex Print offsets and sizes in hexadecimal.
160
161
162 -r, --rel_offset
163 Show relative offsets of members in inner structs.
164
165
166 -p, --expand_pointers
167 Expand class pointer members.
168
169
170 -R, --reorganize
Reorganize struct, demoting and combining bitfields, moving
members to remove alignment holes and padding.
173
174
175 -S, --show_reorg_steps
176 Show the struct layout at each reorganization step.
177
178
179 -i, --contains=CLASS_NAME
Show classes that contain CLASS_NAME.
181
182
183 -a, --anon_include
184 Include anonymous classes.
185
186
187 -A, --nested_anon_include
188 Include nested (inside other structs) anonymous classes.
189
190
191 -B, --bit_holes=NR_HOLES
Show only structs with at least NR_HOLES bit holes.
193
194
195 -d, --recursive
196 Recursive mode, affects several other flags.
197
198
199 -D, --decl_exclude=PREFIX
Exclude classes declared in files with PREFIX.
201
202
203 -f, --find_pointers_to=CLASS_NAME
204 Find pointers to CLASS_NAME.
205
206
207 -H, --holes=NR_HOLES
208 Show only structs with at least NR_HOLES holes.
209
210
211 -I, --show_decl_info
212 Show the file and line number where the tags were defined, if
213 available in the debugging information.
214
215
216 --skip_encoding_btf_vars
217 Do not encode VARs in BTF.
218
219
220 --skip_encoding_btf_decl_tag
221 Do not encode decl tags in BTF.
222
223
224 --skip_encoding_btf_enum64
225 Do not encode enum64 in BTF.
226
227
228 --skip_encoding_btf_type_tag
229 Do not encode type tags in BTF.
230
231
232 --skip_encoding_btf_inconsistent_proto
233 Do not encode functions with multiple inconsistent prototypes or
234 unexpected register use for their parameters, where the regis‐
235 ters used do not match calling conventions.
236
237
238 -j, --jobs=N
239 Run N jobs in parallel. Defaults to number of online processors
240 + 10% (like the 'ninja' build system) if no argument is speci‐
241 fied.
242
243
244 -J, --btf_encode
Encode BTF information from DWARF, used in the Linux kernel
build process when CONFIG_DEBUG_INFO_BTF=y is present, introduced
in Linux v5.2. Used to implement features such as BPF CO-RE
(Compile Once - Run Everywhere).
249
250 See https://nakryiko.com/posts/bpf-portability-and-co-re/.
251
252
253 --btf_encode_detached=FILENAME
254 Same thing as -J/--btf_encode, but storing the raw BTF info into
255 a separate file.
256
257
258 --btf_encode_force
259 Ignore those symbols found invalid when encoding BTF.
260
261
262 --btf_base=PATH
263 Path to the base BTF file, for instance: vmlinux when encoding
264 kernel module BTF information. This may be inferred when asking
265 for a /sys/kernel/btf/MODULE, when it will be autoconfigured to
266 "/sys/kernel/btf/vmlinux".
267
268
269 --btf_gen_floats
Allow producing BTF_KIND_FLOAT entries in systems where the
vmlinux DWARF information has float types.
272
273
274 --btf_gen_optimized
275 Generate BTF for functions with optimization-related suffixes
276 (.isra, .constprop).
277
278
279 --btf_gen_all
280 Allow using all the BTF features supported by pahole.
281
282
283 -l, --show_first_biggest_size_base_type_member
284 Show first biggest size base_type member.
285
286
287 -m, --nr_methods
Show number of methods of all classes, i.e. the number of
functions that have arguments that are pointers to a given class.

To get the number of methods for a specific class, please use:
292
293 $ pahole --nr_methods | grep -w sock
294 sock 1005
295 $
296
In the above example it used the BTF information in
/sys/kernel/btf/vmlinux.
299
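In scripts it can be handy to get just the count for one class. The
sketch below matches the first column exactly instead of using grep; it
assumes the two-column "name count" output shown above, and the
function name nr_methods_of is made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): print only the number of
# methods for one class, matching the class name exactly.
nr_methods_of() {
    pahole --nr_methods | awk -v class="$1" '$1 == class { print $2 }'
}
```

With the output shown above, "nr_methods_of sock" would print 1005.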
300
301 -M, --show_only_data_members
302 Show only the members that use space in the class layout. C++
303 methods will be suppressed.
304
305
306 -n, --nr_members
307 Show number of members.
308
309
310 -N, --class_name_len
Show length of class names.
312
313
314 -O, --dwarf_offset=OFFSET
315 Show tag with DWARF OFFSET.
316
317
318 -P, --packable
Show only structs that have holes that can be packed if members
are reorganized, for instance when using the --reorganize
option.
322
323
--with_flexible_array
325 Show only structs that have a flexible array.
326
327
328 -q, --quiet
329 Be quieter.
330
331
332 -s, --sizes
333 Show size of classes.
334
335
336 -t, --separator=SEP
337 Use SEP as the field separator.
338
339
340 -T, --nr_definitions
341 Show how many times struct was defined.
342
343
344 -u, --defined_in
345 Show CUs where CLASS_NAME (-C) is defined.
346
347
348 --flat_arrays
349 Flatten arrays, so that array[10][2] becomes array[20]. Useful
350 when generating from both CTF/BTF and DWARF encodings for the
351 same binary for testing purposes.
352
353
354 --suppress_aligned_attribute
Suppress forced alignment markers, so that one can compare BTF
or CTF output, which doesn't have that info, to output from DWARF
>= 5.
358
359
360 --suppress_force_paddings
361
Suppress bitfield forced padding at the end of structs, as this
requires something like DWARF's DW_AT_alignment, so that one can
compare against BTF or CTF output, which doesn't have that info.
365
366
367 --suppress_packed
368
Suppress the output of the inference of
__attribute__((__packed__)), so that one can compare against BTF
or CTF output. The inference algorithm uses things like
DW_AT_alignment, so until it is improved to infer that as well for
BTF, allow disabling this output.
374
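Several options above (--sort, --flat_arrays and the --suppress_*
family) exist so that output generated from BTF and from DWARF can be
compared. A sketch of such a comparison follows; it assumes the file
carries both kinds of debugging information, and the function name
diff_btf_dwarf is made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): diff the layouts pahole prints
# from BTF against those it prints from DWARF for the same file.
diff_btf_dwarf() {
    target=$1
    tmpdir=$(mktemp -d) || return 1
    for format in btf dwarf; do
        # Normalize both outputs the same way so only real differences show.
        pahole -F "$format" --sort --flat_arrays \
               --suppress_aligned_attribute --suppress_force_paddings \
               --suppress_packed "$target" > "$tmpdir/$format.txt"
    done
    # diff exits with 0 when identical and 1 when the outputs differ.
    status=0
    diff -u "$tmpdir/btf.txt" "$tmpdir/dwarf.txt" || status=$?
    rm -rf "$tmpdir"
    return "$status"
}
```

For instance, "diff_btf_dwarf vmlinux" prints a unified diff and returns
non-zero when the two encodings disagree.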
375
376 --fixup_silly_bitfields
377 Converts silly bitfields such as "int foo:32" to plain "int
378 foo".
379
380
381 -V, --verbose
Be verbose.
383
384
385 --ptr_table_stats
386 Print statistics about ptr_table data structures, used to hold
387 all the types, tags and functions data structures, for develop‐
388 ment tuning of such tables, tuned for a typical 2021 vmlinux
389 file.
390
391
392 -w, --word_size=WORD_SIZE
393 Change the arch word size to WORD_SIZE.
394
395
396 -x, --exclude=PREFIX
397 Exclude PREFIXed classes.
398
399
400 -X, --cu_exclude=PREFIX
401 Exclude PREFIXed compilation units.
402
403
404 --lang=languages
405 Only process compilation units built from source code written in
406 the specified languages.
407
408 Supported languages:
409
410 ada83, ada95, asm, bliss, c, c89, c99, c11, c++, c++03, c++11,
411 c++14, cobol74,
412 cobol85, d, dylan, fortran77, fortran90, fortran95, fortran03,
413 fortran08,
414 go, haskell, java, julia, modula2, modula3, objc, objc++,
415 ocaml, opencl,
416 pascal83, pli, python, renderscript, rust, swift, upc
417
418 The linux kernel, for instance, is written in 'c89' circa 2022,
419 use that in filters.
420
--lang_exclude=languages
Don't process compilation units built from source code written in
the specified languages.
423
424 To filter out compilation units written in Rust, for instance,
425 use:
426
427 pahole -j --btf_encode --lang_exclude rust
428
429
430 -y, --prefix_filter=PREFIX
431 Include PREFIXed classes.
432
433
434 -z, --hole_size_ge=HOLE_SIZE
435 Show only structs with at least one hole greater or equal to
436 HOLE_SIZE.
437
438
439 --structs
440 Show only structs, all the other filters apply, i.e. to show
441 just the sizes of all structs combine --structs with --sizes,
442 etc.
443
444
445 --packed
446 Show only packed structs, all the other filters apply, i.e. to
447 show just the sizes of all packed structs combine --packed with
448 --sizes, etc.
449
450
451 --unions
452 Show only unions, all the other filters apply, i.e. to show just
the sizes of all unions combine --unions with --sizes, etc.
454
455
456 --version
457 Show a traditional string version, i.e.: "v1.18".
458
459
460 --numeric_version
Show a numeric only version, suitable for use in Makefiles and
scripts where one wants to know whether the installed version
has some feature, i.e.: 118 instead of "v1.18".
464
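A script could gate a feature on the installed version along these
lines. This is a minimal sketch: the function name pahole_at_least and
its optional second argument are made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): succeed when the installed
# pahole is at least the required numeric version (e.g. 118 for v1.18).
pahole_at_least() {
    required=$1
    # An optional second argument supplies a version that was already
    # queried; otherwise ask the installed pahole.
    installed=${2:-$(pahole --numeric_version 2>/dev/null)}
    case $installed in
        ''|*[!0-9]*) return 2 ;;   # pahole missing or unexpected output
    esac
    [ "$installed" -ge "$required" ]
}
```

For instance, "if pahole_at_least 118; then ...; fi" runs the guarded
commands only with v1.18 or later.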
465
466 --kabi_prefix=STRING
467 When the prefix of the string is STRING, treat the string as
468 STRING.
469
470
NOTES
To enable the generation of debugging information in the Linux kernel
build process select CONFIG_DEBUG_INFO. This can be done using make
menuconfig by this path: "Kernel Hacking" -> "Compile-time checks and
compiler options" -> "Compile the kernel with debug info". Consider
also enabling CONFIG_DEBUG_INFO_BTF by going through the aforementioned
menuconfig path and then selecting "Generate BTF typeinfo". Most modern
distributions with eBPF support should come with that in all their
kernels, greatly facilitating the use of pahole.
480
Many distributions also come with debuginfo packages, so just enable
them in your package manager repository configuration and install
kernel-debuginfo, or the debuginfo package for any userspace program
written in a language for which the compiler generates debuginfo (C,
C++, for instance).
485
486
EXAMPLES
All the examples here use either /sys/kernel/btf/vmlinux, if present,
or look up a vmlinux file matching the running kernel, using the
build-id info found in /sys/kernel/notes to make sure it matches.
491
492 Show a type:
493
494 $ pahole -C __u64
495 typedef long long unsigned int __u64;
496 $
497
498
499 Works as well if the only argument is a type name:
500
501 $ pahole raw_spinlock_t
502 typedef struct raw_spinlock raw_spinlock_t;
503 $
504
505
506 Multiple types can be passed, separated by commas:
507
508 $ pahole raw_spinlock_t,raw_spinlock
509 struct raw_spinlock {
510 arch_spinlock_t raw_lock; /* 0 4 */
511
512 /* size: 4, cachelines: 1, members: 1 */
513 /* last cacheline: 4 bytes */
514 };
515 typedef struct raw_spinlock raw_spinlock_t;
516 $
517
518
519 Types can be expanded:
520
521 $ pahole -E raw_spinlock
522 struct raw_spinlock {
523 /* typedef arch_spinlock_t */ struct qspinlock {
524 union {
525 /* typedef atomic_t */ struct {
526 int counter; /* 0 4 */
527 } val; /* 0 4 */
528 struct {
529 /* typedef u8 -> __u8 */ unsigned char locked; /* 0 1 */
530 /* typedef u8 -> __u8 */ unsigned char pending; /* 1 1 */
531 }; /* 0 2 */
532 struct {
533 /* typedef u16 -> __u16 */ short unsigned int locked_pending; /* 0 2 */
534 /* typedef u16 -> __u16 */ short unsigned int tail; /* 2 2 */
535 }; /* 0 4 */
536 }; /* 0 4 */
537 } raw_lock; /* 0 4 */
538
539 /* size: 4, cachelines: 1, members: 1 */
540 /* last cacheline: 4 bytes */
541 };
542 $
543
544
When decoding OOPSes you may want to see the offsets and sizes in
hexadecimal:
547
548 $ pahole --hex thread_struct
549 struct thread_struct {
550 struct desc_struct tls_array[3]; /* 0 0x18 */
551 long unsigned int sp; /* 0x18 0x8 */
552 short unsigned int es; /* 0x20 0x2 */
553 short unsigned int ds; /* 0x22 0x2 */
554 short unsigned int fsindex; /* 0x24 0x2 */
555 short unsigned int gsindex; /* 0x26 0x2 */
556 long unsigned int fsbase; /* 0x28 0x8 */
557 long unsigned int gsbase; /* 0x30 0x8 */
558 struct perf_event * ptrace_bps[4]; /* 0x38 0x20 */
559 /* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
560 long unsigned int debugreg6; /* 0x58 0x8 */
561 long unsigned int ptrace_dr7; /* 0x60 0x8 */
562 long unsigned int cr2; /* 0x68 0x8 */
563 long unsigned int trap_nr; /* 0x70 0x8 */
564 long unsigned int error_code; /* 0x78 0x8 */
565 /* --- cacheline 2 boundary (128 bytes) --- */
566 struct io_bitmap * io_bitmap; /* 0x80 0x8 */
567 long unsigned int iopl_emul; /* 0x88 0x8 */
568 mm_segment_t addr_limit; /* 0x90 0x8 */
569 unsigned int sig_on_uaccess_err:1; /* 0x98: 0 0x4 */
570 unsigned int uaccess_err:1; /* 0x98:0x1 0x4 */
571
572 /* XXX 30 bits hole, try to pack */
573 /* XXX 36 bytes hole, try to pack */
574
575 /* --- cacheline 3 boundary (192 bytes) --- */
576 struct fpu fpu; /* 0xc0 0x1040 */
577
578 /* size: 4352, cachelines: 68, members: 20 */
579 /* sum members: 4312, holes: 1, sum holes: 36 */
580 /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 30 bits */
581 };
582 $
583
584
OK, I know the struct involved is a 'struct thread_struct' and that
the offset is 0x178, so it must be in that 'fpu' struct... No problem,
expand 'struct thread_struct' and combine with grep:
588
589 $ pahole --hex -E thread_struct | egrep '(0x178|struct fpu)' -B4 -A4
590 /* XXX 30 bits hole, try to pack */
591 /* XXX 36 bytes hole, try to pack */
592
593 /* --- cacheline 3 boundary (192 bytes) --- */
594 struct fpu {
595 unsigned int last_cpu; /* 0xc0 0x4 */
596
597 /* XXX 4 bytes hole, try to pack */
598
599 --
600 /* typedef u8 -> __u8 */ unsigned char alimit; /* 0x171 0x1 */
601
602 /* XXX 6 bytes hole, try to pack */
603
604 struct math_emu_info * info; /* 0x178 0x8 */
605 /* --- cacheline 6 boundary (384 bytes) --- */
606 /* typedef u32 -> __u32 */ unsigned int entry_eip; /* 0x180 0x4 */
607 } soft; /* 0x100 0x88 */
608 struct xregs_state {
609 $
610
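The offset search above can be wrapped in a small function that matches
only the offset column of the layout comments. This is a sketch that
relies on the "/* offset size */" comment format shown in the examples;
the function name member_at_offset is made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): print the members of a type,
# expanded with -E, that start at a given offset.
member_at_offset() {
    type_name=$1
    offset=$2   # as printed by --hex, e.g. 0x178 (offset zero prints as 0)
    # Match only the first number in the trailing comment, optionally
    # followed by a bitfield position such as ":0x1".
    pahole --hex -E "$type_name" |
        grep -E "/\*[[:space:]]+${offset}(:[^[:space:]]*)?[[:space:]]"
}
```

For instance, "member_at_offset thread_struct 0x178" prints the line
for the 'info' member seen above. Members of nested structs that start
at the same offset are all printed.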
611
612 Want to know where 'struct thread_struct' is defined in the kernel
613 sources?
614
615 $ pahole -I thread_struct | head -2
616 /* Used at: /sys/kernel/btf/vmlinux */
617 /* <0> (null):0 */
618 $
619
620
That information is not present in BTF, so use DWARF instead. This takes
a little bit longer, and assumes pahole finds the matching vmlinux file:
623
624 $ pahole -Fdwarf -I thread_struct | head -2
625 /* Used at: /home/acme/git/linux/arch/x86/kernel/head64.c */
626 /* <3333> /home/acme/git/linux/arch/x86/include/asm/processor.h:485 */
627 $
628
629
630 To find the biggest data structures in the Linux kernel:
631
632 $ pahole -s | sort -k2 -nr | head -5
633 cmp_data 290904 1
634 dec_datas 274520 1
635 cpu_entry_area 217088 0
636 pglist_data 172928 4
637 saved_cmdlines_buffer 131104 1
638 $
639
640 The second column is the size in bytes and the third is the number of
641 alignment holes in that structure.
642
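Since the third column is the number of holes, the same pipeline can be
narrowed to the biggest structures that also have holes, which are
natural candidates for --reorganize. This is a sketch; the function
name largest_with_holes is made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): list the biggest data
# structures that have at least one alignment hole.
largest_with_holes() {
    limit=${1:-5}
    pahole --sizes |
        awk '$3 > 0' |      # keep only entries with one or more holes
        sort -k2,2nr |      # biggest size first
        head -n "$limit"
}
```

For instance, "largest_with_holes 10" prints the ten biggest such
structures.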
643 Show data structures that have a raw spinlock and are related to the
644 RCU mechanism:
645
646 $ pahole --contains raw_spinlock_t --prefix rcu
647 rcu_node
648 rcu_data
649 rcu_state
650 $
651
652 To see that in context, combine it with grep:
653
654 $ pahole rcu_state | grep raw_spinlock_t -B1 -A5
655 /* --- cacheline 52 boundary (3328 bytes) --- */
656 raw_spinlock_t ofl_lock; /* 3328 4 */
657
658 /* size: 3392, cachelines: 53, members: 35 */
659 /* sum members: 3250, holes: 7, sum holes: 82 */
660 /* padding: 60 */
661 };
662 $
663
664
PRETTY PRINTING
pahole can also use the data structure types to pretty print raw data
specified via --prettify. To consume raw data from the standard input,
just use '--prettify -'.

The raw data, from a file or from stdin, is pretty printed according to
the type specified:
672
673 $ pahole -C modversion_info drivers/scsi/sg.ko
674 struct modversion_info {
675 long unsigned int crc; /* 0 8 */
676 char name[56]; /* 8 56 */
677
678 /* size: 64, cachelines: 1, members: 2 */
679 };
680 $
681 $ objcopy -O binary --only-section=__versions drivers/scsi/sg.ko versions
682 $
683 $ ls -la versions
684 -rw-rw-r--. 1 acme acme 7616 Jun 25 11:33 versions
685 $
686 $ pahole --count 3 -C modversion_info drivers/scsi/sg.ko --prettify versions
687 {
688 .crc = 0x8dabd84,
689 .name = "module_layout",
690 },
691 {
692 .crc = 0x45e4617b,
693 .name = "no_llseek",
694 },
695 {
696 .crc = 0xa23fae8c,
697 .name = "param_ops_int",
698 },
699 $
700 $ pahole --skip 1 --count 2 -C modversion_info drivers/scsi/sg.ko --prettify - < versions
701 {
702 .crc = 0x45e4617b,
703 .name = "no_llseek",
704 },
705 {
706 .crc = 0xa23fae8c,
707 .name = "param_ops_int",
708 },
709 $
Skipping one record is equivalent to seeking past its 64 bytes, here
printing just the next record:
711
712 $ pahole --seek_bytes 64 --count 1 -C modversion_info drivers/scsi/sg.ko --prettify versions
713 {
714 .crc = 0x45e4617b,
715 .name = "no_llseek",
716 },
717 $
718
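For fixed size records such as 'struct modversion_info', the relation
between --skip and --seek_bytes is plain arithmetic, as the sketch
below shows. The function names are made up for this illustration, and
the calculation does not apply to variable sized records.

```shell
# Hypothetical helpers (not part of pahole) for fixed size records.

# Bytes to pass to --seek_bytes to skip a number of records.
record_seek_bytes() {
    record_size=$1
    skip=$2
    echo $(( record_size * skip ))
}

# Number of whole records that fit in a file of the given size.
record_count() {
    file_size=$1
    record_size=$2
    echo $(( file_size / record_size ))
}
```

With the values above, "record_seek_bytes 64 1" prints 64 and
"record_count 7616 64" prints 119, the number of records in the
versions file.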
719 -C, --class_name=CLASS_NAME
720 Pretty print according to this class. Arguments may be passed to
721 it to affect how the pretty printing is performed, e.g.:
722
723
724 -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_EXIT)'
725
This would select 'struct perf_event_header' as the type to use to
pretty print records. It states that the 'size' field in that struct
should be used to figure out the size of the record (variable sized
records), that 'enum perf_event_type' should be used to pretty print
the numeric value in perf_event_header->type and furthermore that it
should be used to heuristically look for structs with the same name
(lowercase) as the enum entry that is converted from the type field,
using it to pretty print instead of the base 'perf_event_header' type.
See the PRETTY PRINTING EXAMPLES section below.
735
Furthermore the 'filter=' part can be used, so far with only the '=='
operator, to filter based on the 'type' field, converting the string
'PERF_RECORD_EXIT' to a number according to type_enum.
739
The 'sizeof' arg defaults to the 'size' member name. If the name is
different, one can use the 'sizeof=sz' form, ditto for the
'type=other_member_name' field, which defaults to 'type'.
744
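Because these specifications contain parentheses and get long, a script
may prefer to assemble them. The sketch below builds the form shown
above for structs that use the default 'size' and 'type' member names;
the function name pretty_class_spec is made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): build a -C argument for
# variable sized records whose header uses the default 'size' and 'type'
# member names.
pretty_class_spec() {
    struct_name=$1
    enum_name=$2
    record_type=${3:-}   # optional enumerator to filter on
    spec="$struct_name(sizeof,type,type_enum=$enum_name"
    if [ -n "$record_type" ]; then
        spec="$spec,filter=type==$record_type"
    fi
    printf '%s)\n' "$spec"
}
```

For instance, -C "$(pretty_class_spec perf_event_header perf_event_type
PERF_RECORD_EXIT)" passes the specification shown above; the double
quotes keep the shell from interpreting the parentheses.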
745
PRETTY PRINTING EXAMPLES
Looking at the ELF header for a vmlinux file, using BTF, first let's
discover the ELF header type:
749
750 $ pahole --sizes | grep -i elf | grep -i _h
751 elf64_hdr 64 0
752 elf32_hdr 52 0
753 $
754
755 Now we can use this to show the first record from offset zero:
756
757 $ pahole -C elf64_hdr --count 1 --prettify /lib/modules/5.8.0-rc3+/build/vmlinux
758 {
759 .e_ident = { 127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
760 .e_type = 2,
761 .e_machine = 62,
762 .e_version = 1,
763 .e_entry = 16777216,
764 .e_phoff = 64,
765 .e_shoff = 775923840,
766 .e_flags = 0,
767 .e_ehsize = 64,
768 .e_phentsize = 56,
769 .e_phnum = 5,
770 .e_shentsize = 64,
771 .e_shnum = 80,
772 .e_shstrndx = 79,
773 },
774 $
775
776 This is equivalent to:
777
778 $ pahole --header elf64_hdr --prettify /lib/modules/5.8.0-rc3+/build/vmlinux
779
780 The --header option also allows reference in other command line options
781 to fields in the header. This is useful when one wants to show multi‐
782 ple records in a file and the range where those fields are located is
783 specified in header fields, such as for perf.data files:
784
785 $ pahole --hex ~/bin/perf --header perf_file_header --prettify perf.data
786 {
787 .magic = 0x32454c4946524550,
788 .size = 0x68,
789 .attr_size = 0x88,
790 .attrs = {
791 .offset = 0xa8,
792 .size = 0x88,
793 },
794 .data = {
795 .offset = 0x130,
796 .size = 0x588,
797 },
798 .event_types = {
799 .offset = 0,
800 .size = 0,
801 },
802 .adds_features = { 0x16717ffc, 0, 0, 0 },
803 },
804 $
805
806 So to display the cgroups records in the perf_file_header.data section
807 we can use:
808
809 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_CGROUP)' --prettify perf.data
810 {
811 .header = {
812 .type = PERF_RECORD_CGROUP,
813 .misc = 0,
814 .size = 40,
815 },
816 .id = 1,
817 .path = "/",
818 },
819 {
820 .header = {
821 .type = PERF_RECORD_CGROUP,
822 .misc = 0,
823 .size = 48,
824 },
825 .id = 1553,
826 .path = "/system.slice",
827 },
828 {
829 .header = {
830 .type = PERF_RECORD_CGROUP,
831 .misc = 0,
832 .size = 48,
833 },
834 .id = 8,
835 .path = "/machine.slice",
836 },
837 {
838 .header = {
839 .type = PERF_RECORD_CGROUP,
840 .misc = 0,
841 .size = 128,
842 },
843 .id = 7828,
844 .path = "/machine.slice/libpod-42be8e8d4eb9d22405845005f0d04ea398548dccc934a150fbaa3c1f1f9492c2.scope",
845 },
846 {
847 .header = {
848 .type = PERF_RECORD_CGROUP,
849 .misc = 0,
850 .size = 88,
851 },
852 .id = 13,
853 .path = "/machine.slice/machine-qemu\x2d1\x2drhel6.sandy.scope",
854 },
855 $
856
857 For the common case of the header having a member that has the 'offset'
858 and 'size' members, it is possible to use this more compact form:
859
860 $ pahole ~/bin/perf --header=perf_file_header --range=data -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_CGROUP)' --prettify perf.data
861
This uses ~/bin/perf to get the type definitions, then defines 'struct
perf_file_header' as the header, then seeks '$header.data.offset' bytes
from the start of the file, and considers '$header.data.size' bytes
worth of such records. The filter expression may omit a common prefix:
in this case it could equivalently be written as 'filter=type==CGROUP'.
The 'filter=' can also be omitted, getting as compact as
'type==CGROUP'.
869
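When the same header and range are used repeatedly, the compact form
can be wrapped in a function. This is a sketch that only rearranges the
options from the command above; the function name prettify_range and
its argument order are made up for this illustration.

```shell
# Hypothetical helper (not part of pahole): pretty print the records
# found in a range described by a member of the file header.
prettify_range() {
    types_file=$1   # file with the type definitions, e.g. ~/bin/perf
    header=$2       # header struct, e.g. perf_file_header
    range=$3        # header member with 'offset' and 'size', e.g. data
    class_spec=$4   # -C argument, e.g. 'perf_event_header(sizeof,type)'
    data_file=$5    # raw data to pretty print, e.g. perf.data
    pahole "$types_file" --header="$header" --range="$range" \
           -C "$class_spec" --prettify "$data_file"
}
```

For instance, "prettify_range ~/bin/perf perf_file_header data
'perf_event_header(sizeof,type,type_enum=perf_event_type,type==CGROUP)'
perf.data" is meant to behave like the compact command above.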
870 If we look at:
871
872 $ pahole ~/bin/perf -C perf_event_header
873 struct perf_event_header {
874 __u32 type; /* 0 4 */
875 __u16 misc; /* 4 2 */
876 __u16 size; /* 6 2 */
877
878 /* size: 8, cachelines: 1, members: 3 */
879 /* last cacheline: 8 bytes */
880 };
881 $
882
883 And:
884
885 $ pahole ~/bin/perf -C perf_event_type
886 enum perf_event_type {
887 PERF_RECORD_MMAP = 1,
888 PERF_RECORD_LOST = 2,
889 PERF_RECORD_COMM = 3,
890 PERF_RECORD_EXIT = 4,
891 PERF_RECORD_THROTTLE = 5,
892 PERF_RECORD_UNTHROTTLE = 6,
893 PERF_RECORD_FORK = 7,
894 PERF_RECORD_READ = 8,
895 PERF_RECORD_SAMPLE = 9,
896 PERF_RECORD_MMAP2 = 10,
897 PERF_RECORD_AUX = 11,
898 PERF_RECORD_ITRACE_START = 12,
899 PERF_RECORD_LOST_SAMPLES = 13,
900 PERF_RECORD_SWITCH = 14,
901 PERF_RECORD_SWITCH_CPU_WIDE = 15,
902 PERF_RECORD_NAMESPACES = 16,
903 PERF_RECORD_KSYMBOL = 17,
904 PERF_RECORD_BPF_EVENT = 18,
905 PERF_RECORD_CGROUP = 19,
906 PERF_RECORD_TEXT_POKE = 20,
907 PERF_RECORD_MAX = 21,
908 };
909 $
910
911 And furthermore:
912
913 $ pahole ~/bin/perf -C perf_record_cgroup
914 struct perf_record_cgroup {
915 struct perf_event_header header; /* 0 8 */
916 __u64 id; /* 8 8 */
917 char path[4096]; /* 16 4096 */
918
919 /* size: 4112, cachelines: 65, members: 3 */
920 /* last cacheline: 16 bytes */
921 };
922 $
923
924 Then we can see how the perf_event_header.type could be converted from
925 a __u32 to a string (PERF_RECORD_CGROUP). If we remove that
926 type_enum=perf_event_type, we will lose the conversion of 'struct
927 perf_event_header' to the more descriptive 'struct perf_record_cgroup',
928 and also the beautification of the header.type field:
929
930 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,filter=type==19)' --prettify perf.data
931 {
932 .type = 19,
933 .misc = 0,
934 .size = 40,
935 },
936 {
937 .type = 19,
938 .misc = 0,
939 .size = 48,
940 },
941 {
942 .type = 19,
943 .misc = 0,
944 .size = 48,
945 },
946 {
947 .type = 19,
948 .misc = 0,
949 .size = 128,
950 },
951 {
952 .type = 19,
953 .misc = 0,
954 .size = 88,
955 },
956 $
957
Some of the records are not found in 'type_enum=perf_event_type' so
some of the records don't get converted to a type that fully shows
their contents. For perf we know that those are in another enumeration,
'enum perf_user_event_type', so, for these cases, we can create a
'virtual enum', i.e. the sum of two enums, and then get all those
entries decoded and properly cast. First, a few records with just
'enum perf_event_type':
965
966 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type)' --count 4 --prettify perf.data
967 {
968 .type = 79,
969 .misc = 0,
970 .size = 32,
971 },
972 {
973 .type = 73,
974 .misc = 0,
975 .size = 40,
976 },
977 {
978 .type = 74,
979 .misc = 0,
980 .size = 32,
981 },
982 {
983 .header = {
984 .type = PERF_RECORD_CGROUP,
985 .misc = 0,
986 .size = 40,
987 },
988 .id = 1,
989 .path = "/",
990 },
991 $
992
993 Now with both enumerations, i.e. with
994 'type_enum=perf_event_type+perf_user_event_type':
995
996 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type+perf_user_event_type)' --count 5 --prettify perf.data
997 {
998 .header = {
999 .type = PERF_RECORD_TIME_CONV,
1000 .misc = 0,
1001 .size = 32,
1002 },
1003 .time_shift = 31,
1004 .time_mult = 1016803377,
1005 .time_zero = 435759009518382,
1006 },
1007 {
1008 .header = {
1009 .type = PERF_RECORD_THREAD_MAP,
1010 .misc = 0,
1011 .size = 40,
1012 },
1013 .nr = 1,
1014 .entries = 0x50 0x7e 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00,
1015 },
1016 {
1017 .header = {
1018 .type = PERF_RECORD_CPU_MAP,
1019 .misc = 0,
1020 .size = 32,
1021 },
1022 .data = {
1023 .type = 1,
1024 .data = "",
1025 },
1026 },
1027 {
1028 .header = {
1029 .type = PERF_RECORD_CGROUP,
1030 .misc = 0,
1031 .size = 40,
1032 },
1033 .id = 1,
1034 .path = "/",
1035 },
1036 {
1037 .header = {
1038 .type = PERF_RECORD_CGROUP,
1039 .misc = 0,
1040 .size = 48,
1041 },
1042 .id = 1553,
1043 .path = "/system.slice",
1044 },
1045 $
1046
It is possible to pass multiple types; one has only to make sure they
appear in the file in sequence. For the perf.data example (see the
perf_file_header dump above), one can print the perf_file_attr structs
in the header attrs range, then the perf_event_header in the data range
with the following command:
1052
1053 pahole ~/bin/perf --header=perf_file_header -C 'perf_file_attr(range=attrs),perf_event_header(range=data,sizeof,type,type_enum=perf_event_type+perf_user_event_type)' --prettify perf.data
1054
1055
SEE ALSO
eu-readelf(1), readelf(1), objdump(1).
1058
1059 https://www.kernel.org/doc/ols/2007/ols2007v2-pages-35-44.pdf.
1060
AUTHOR
pahole was written and is maintained by Arnaldo Carvalho de Melo
<acme@kernel.org>.
1064
1065 Thanks to Andrii Nakryiko and Martin KaFai Lau for providing the BTF
1066 encoder and improving the codebase while making sure the BTF encoder
1067 works as needed to be used in encoding the Linux kernel .BTF section
from the DWARF info generated by gcc. For that Andrii wrote a BTF
deduplicator in libbpf that is used by pahole.
1070
1071 Also thanks to Conectiva, Mandriva and Red Hat for allowing me to work
1072 on these tools.
1073
1074 Please send bug reports to <dwarves@vger.kernel.org>.
1075
1076 No subscription is required.
1077
1078
1079
dwarves January 16, 2020 pahole(1)