pahole(1) dwarves pahole(1)

NAME
pahole - Shows and manipulates data structure layouts and pretty prints raw
data.

SYNOPSIS
pahole [options] files

DESCRIPTION
pahole shows data structure layouts encoded in debugging information
formats; DWARF, CTF and BTF are supported.
15
This is useful for, among other things: optimizing important data
structures by reducing their size, figuring out which field sits
at a given offset from the start of a data structure, investigating ABI
changes and, more generally, understanding a new codebase you have to
work with.

It also uses these structure layouts to pretty print raw data, taken from
a file or fed to its standard input, e.g.:
24
25 $ pahole --header elf64_hdr --prettify /lib/modules/5.8.0-rc6+/build/vmlinux
26 {
27 .e_ident = { 127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
28 .e_type = 2,
29 .e_machine = 62,
30 .e_version = 1,
31 .e_entry = 16777216,
32 .e_phoff = 64,
33 .e_shoff = 604653784,
34 .e_flags = 0,
35 .e_ehsize = 64,
36 .e_phentsize = 56,
37 .e_phnum = 5,
38 .e_shentsize = 64,
39 .e_shnum = 80,
40 .e_shstrndx = 79,
41 },
42 $
43
44 See the PRETTY PRINTING section for further examples and documentation.
45
46 The files must have associated debugging information. This information
47 may be inside the file itself, in ELF sections, or in another file.
48
One way to have this information is to specify the -g option to the
compiler when building the file. When this is done the information is
stored in ELF sections. For the DWARF debugging information format
this adds, among others, the .debug_info ELF section. For CTF it is
found in just one ELF section, .SUNW_ctf. BTF comes in at least the
.BTF ELF section, and may also come with the .BTF.ext ELF section.
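Whether a given file actually carries DWARF can be checked before handing it to pahole. The following is a minimal sketch, assuming a GNU/Linux system with cc and readelf(1) on the PATH; the has_dwarf helper and the demo.c file are hypothetical names used only for illustration:

```sh
# has_dwarf FILE: hypothetical helper, succeeds when FILE has a .debug_info
# ELF section, i.e. when DWARF debugging information is embedded in it.
has_dwarf() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Demo, only attempted when a C compiler and readelf are available.
if command -v cc >/dev/null 2>&1 && command -v readelf >/dev/null 2>&1; then
    dir=$(mktemp -d)
    printf 'struct demo { char c; long l; };\nstruct demo d;\nint main(void) { return 0; }\n' > "$dir/demo.c"
    cc -g -o "$dir/demo" "$dir/demo.c"    # -g asks for debugging information
    if has_dwarf "$dir/demo"; then
        echo "demo: DWARF present"
    else
        echo "demo: no DWARF found"
    fi
    rm -rf "$dir"
fi
```

A file for which has_dwarf fails may still be usable if its debugging information lives in a separate debuginfo file or in BTF or CTF sections.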
55
56 The debuginfo packages available in most Linux distributions are also
57 supported by pahole, where the debugging information is available in a
58 separate file.
59
60 By default, pahole shows the layout of all named structs in the files
61 specified.
62
If no files are specified, pahole checks whether /sys/kernel/btf/vmlinux
is present and, if so, uses the BTF information it holds about the
running kernel, i.e. this works:
66
67 $ pahole list_head
68 struct list_head {
69 struct list_head * next; /* 0 8 */
70 struct list_head * prev; /* 8 8 */
71
72 /* size: 16, cachelines: 1, members: 2 */
73 /* last cacheline: 16 bytes */
74 };
75 $
76
77 If BTF is not present and no file is passed, then a vmlinux that
78 matches the build-id for the running kernel will be looked up in the
79 usual places, including where the kernel debuginfo packages put it,
80 looking for DWARF info instead.
81
82 See the EXAMPLES section for more usage suggestions.
83
It also pretty prints whatever is fed to its standard input, according
to the type specified; see the PRETTY PRINTING section.
86
87 Use --count to state how many records should be pretty printed.
88
89
OPTIONS
pahole supports the following options.
92
93
94 -C, --class_name=CLASS_NAMES
95 Show just these classes. This can be a comma separated list of
96 class names or file URLs (e.g.: file://class_list.txt)
97
98
99 -c, --cacheline_size=SIZE
100 Set cacheline size to SIZE bytes.
101
102
--sort Sort the output by type name. This may grow to allow
sorting by other criteria.

This is mostly needed so that pretty printing from BTF and DWARF
can be compared when using multiple threads to load
DWARF data, since the order in which the types in the compile units
are processed is then not deterministic.
110
111
112 --count=COUNT
113 Pretty print the first COUNT records from input.
114
115
116 --skip=COUNT
117 Skip COUNT input records.
118
119
120 -E, --expand_types
Expand class members. Useful to find which member of an inner
struct sits at a given offset from the beginning of a struct.
123
124
125 -F, --format_path
126 Allows specifying a list of debugging formats to try, in order.
127 Right now this includes "ctf" and "dwarf". The default format
128 path used is equivalent to "-F dwarf,ctf".
129
130
131 --hashbits=BITS
Allows specifying the number of hash table bits for the debugging format
loader to use. The only loader affected so far is the "dwarf" one;
its default is 15 and its maximum is 21 bits. Tweak
it to see if it improves performance as the kernel evolves and
more types and functions have to be loaded.
137
138
139 --hex Print offsets and sizes in hexadecimal.
140
141
142 -r, --rel_offset
143 Show relative offsets of members in inner structs.
144
145
146 -p, --expand_pointers
147 Expand class pointer members.
148
149
150 -R, --reorganize
151 Reorganize struct, demoting and combining bitfields, moving mem‐
152 bers to remove alignment holes and padding.
153
154
155 -S, --show_reorg_steps
156 Show the struct layout at each reorganization step.
157
158
159 -i, --contains=CLASS_NAME
Show classes that contain CLASS_NAME.
161
162
163 -a, --anon_include
164 Include anonymous classes.
165
166
167 -A, --nested_anon_include
168 Include nested (inside other structs) anonymous classes.
169
170
171 -B, --bit_holes=NR_HOLES
Show only structs with at least NR_HOLES bit holes.
173
174
175 -d, --recursive
176 Recursive mode, affects several other flags.
177
178
179 -D, --decl_exclude=PREFIX
Exclude classes declared in files with PREFIX.
181
182
183 -f, --find_pointers_to=CLASS_NAME
184 Find pointers to CLASS_NAME.
185
186
187 -H, --holes=NR_HOLES
188 Show only structs with at least NR_HOLES holes.
189
190
191 -I, --show_decl_info
192 Show the file and line number where the tags were defined, if
193 available in the debugging information.
194
195
196 --skip_encoding_btf_vars
197 Do not encode VARs in BTF.
198
199
200 --skip_encoding_btf_decl_tag
201 Do not encode decl tags in BTF.
202
203
204 --skip_encoding_btf_type_tag
205 Do not encode type tags in BTF.
206
207
208 -j, --jobs=N
209 Run N jobs in parallel. Defaults to number of online processors
210 + 10% (like the 'ninja' build system) if no argument is speci‐
211 fied.
212
213
214 -J, --btf_encode
Encode BTF information from DWARF. Used in the Linux kernel
build process when CONFIG_DEBUG_INFO_BTF=y is set, an option
introduced in Linux v5.2, to implement features such as BPF
CO-RE (Compile Once - Run Everywhere).
219
220 See https://nakryiko.com/posts/bpf-portability-and-co-re/.
221
222
223 --btf_encode_detached=FILENAME
224 Same thing as -J/--btf_encode, but storing the raw BTF info into
225 a separate file.
226
227
228 --btf_encode_force
229 Ignore those symbols found invalid when encoding BTF.
230
231
232 --btf_base=PATH
233 Path to the base BTF file, for instance: vmlinux when encoding
234 kernel module BTF information. This may be inferred when asking
235 for a /sys/kernel/btf/MODULE, when it will be autoconfigured to
236 "/sys/kernel/btf/vmlinux".
237
238
239 --btf_gen_floats
240 Allow producing BTF_KIND_FLOAT entries in systems where the vm‐
241 linux DWARF information has float types.
242
243
244 --btf_gen_all
245 Allow using all the BTF features supported by pahole.
246
247
248 -l, --show_first_biggest_size_base_type_member
249 Show first biggest size base_type member.
250
251
252 -m, --nr_methods
Show number of methods of all classes, i.e. the number of
functions that have arguments that are pointers to a given class.

To get the number of methods for a specific class, use:
257
258 $ pahole --nr_methods | grep -w sock
259 sock 1005
260 $
261
In the above example pahole used the BTF information in
/sys/kernel/btf/vmlinux.
264
265
266 -M, --show_only_data_members
267 Show only the members that use space in the class layout. C++
268 methods will be suppressed.
269
270
271 -n, --nr_members
272 Show number of members.
273
274
275 -N, --class_name_len
Show the length of class names.
277
278
279 -O, --dwarf_offset=OFFSET
280 Show tag with DWARF OFFSET.
281
282
-P, --packable
Show only structs that have holes that can be packed if members
are reorganized, for instance when using the --reorganize
option.


--with_flexible_array
Show only structs that have a flexible array.
291
292
293 -q, --quiet
294 Be quieter.
295
296
297 -s, --sizes
298 Show size of classes.
299
300
301 -t, --separator=SEP
302 Use SEP as the field separator.
303
304
305 -T, --nr_definitions
306 Show how many times struct was defined.
307
308
309 -u, --defined_in
310 Show CUs where CLASS_NAME (-C) is defined.
311
312
313 --flat_arrays
314 Flatten arrays, so that array[10][2] becomes array[20]. Useful
315 when generating from both CTF/BTF and DWARF encodings for the
316 same binary for testing purposes.
317
318
319 --suppress_aligned_attribute
Suppress forced alignment markers, so that one can compare BTF
or CTF output, which don't have that info, to output from DWARF
>= 5.
323
324
--suppress_force_paddings
Suppress bitfield forced padding at the end of structs, as this
requires something like DWARF's DW_AT_alignment, so that one can
compare against BTF or CTF output, which don't have that info.


--suppress_packed
Suppress the output of the inferred __attribute__((__packed__)),
so that one can compare against BTF or CTF output. The inference
algorithm uses things like DW_AT_alignment, so until it is improved
to infer that as well for BTF, this option allows disabling this output.
339
340
341 --fixup_silly_bitfields
342 Converts silly bitfields such as "int foo:32" to plain "int
343 foo".
344
345
346 -V, --verbose
Be verbose.
348
349
350 --ptr_table_stats
351 Print statistics about ptr_table data structures, used to hold
352 all the types, tags and functions data structures, for develop‐
353 ment tuning of such tables, tuned for a typical 2021 vmlinux
354 file.
355
356
357 -w, --word_size=WORD_SIZE
358 Change the arch word size to WORD_SIZE.
359
360
361 -x, --exclude=PREFIX
362 Exclude PREFIXed classes.
363
364
365 -X, --cu_exclude=PREFIX
366 Exclude PREFIXed compilation units.
367
368
369 -y, --prefix_filter=PREFIX
370 Include PREFIXed classes.
371
372
373 -z, --hole_size_ge=HOLE_SIZE
374 Show only structs with at least one hole greater or equal to
375 HOLE_SIZE.
376
377
378 --structs
Show only structs. All the other filters apply, e.g. to show
just the sizes of all structs combine --structs with --sizes.
382
383
384 --packed
Show only packed structs. All the other filters apply, e.g. to
show just the sizes of all packed structs combine --packed with
--sizes.
388
389
390 --unions
Show only unions. All the other filters apply, e.g. to show just
the sizes of all unions combine --unions with --sizes.
393
394
395 --version
Show a traditional string version, e.g.: "v1.18".
397
398
399 --numeric_version
Show a numeric only version, suitable for use in Makefiles and
scripts where one wants to know whether the installed version
has some feature, e.g.: 118 instead of "v1.18".
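A script can gate a feature on that number. This is a minimal sketch; the version_at_least helper is a hypothetical name, and the 118 threshold is only the example value used above:

```sh
# version_at_least HAVE WANT: succeeds when the numeric version HAVE is
# greater than or equal to WANT, e.g. 124 (v1.24) against 118 (v1.18).
version_at_least() {
    [ "$1" -ge "$2" ]
}

# Only query pahole when it is installed; otherwise stay silent.
if command -v pahole >/dev/null 2>&1; then
    have=$(pahole --numeric_version)
    if version_at_least "$have" 118; then
        echo "pahole $have is at least v1.18"
    else
        echo "pahole $have is older than v1.18"
    fi
fi
```

Comparing integers avoids parsing the "v1.18" string form in the shell.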
403
404
405 --kabi_prefix=STRING
406 When the prefix of the string is STRING, treat the string as
407 STRING.
408
409
NOTES
To enable the generation of debugging information in the Linux kernel
build process select CONFIG_DEBUG_INFO. This can be done using make
menuconfig by following this path: "Kernel Hacking" -> "Compile-time checks and
compiler options" -> "Compile the kernel with debug info". Consider
enabling CONFIG_DEBUG_INFO_BTF as well, by following the same
menuconfig path and then selecting "Generate BTF typeinfo". Most modern
distributions with eBPF support should ship that in all their
kernels, greatly facilitating the use of pahole.

Many distributions also come with debuginfo packages, so just enable them
in your package manager repository configuration and install
kernel-debuginfo, or the debuginfo package for any userspace program written
in a language for which the compiler generates debuginfo (C and C++, for instance).
424
425
EXAMPLES
All the examples here use either /sys/kernel/btf/vmlinux, if present,
or look up a vmlinux file matching the running kernel, using the build-id
info found in /sys/kernel/notes to make sure it matches.
430
431 Show a type:
432
433 $ pahole -C __u64
434 typedef long long unsigned int __u64;
435 $
436
437
438 Works as well if the only argument is a type name:
439
440 $ pahole raw_spinlock_t
441 typedef struct raw_spinlock raw_spinlock_t;
442 $
443
444
445 Multiple types can be passed, separated by commas:
446
447 $ pahole raw_spinlock_t,raw_spinlock
448 struct raw_spinlock {
449 arch_spinlock_t raw_lock; /* 0 4 */
450
451 /* size: 4, cachelines: 1, members: 1 */
452 /* last cacheline: 4 bytes */
453 };
454 typedef struct raw_spinlock raw_spinlock_t;
455 $
456
457
458 Types can be expanded:
459
460 $ pahole -E raw_spinlock
461 struct raw_spinlock {
462 /* typedef arch_spinlock_t */ struct qspinlock {
463 union {
464 /* typedef atomic_t */ struct {
465 int counter; /* 0 4 */
466 } val; /* 0 4 */
467 struct {
468 /* typedef u8 -> __u8 */ unsigned char locked; /* 0 1 */
469 /* typedef u8 -> __u8 */ unsigned char pending; /* 1 1 */
470 }; /* 0 2 */
471 struct {
472 /* typedef u16 -> __u16 */ short unsigned int locked_pending; /* 0 2 */
473 /* typedef u16 -> __u16 */ short unsigned int tail; /* 2 2 */
474 }; /* 0 4 */
475 }; /* 0 4 */
476 } raw_lock; /* 0 4 */
477
478 /* size: 4, cachelines: 1, members: 1 */
479 /* last cacheline: 4 bytes */
480 };
481 $
482
483
484 When decoding OOPSes you may want to see the offsets and sizes in hexa‐
485 decimal:
486
487 $ pahole --hex thread_struct
488 struct thread_struct {
489 struct desc_struct tls_array[3]; /* 0 0x18 */
490 long unsigned int sp; /* 0x18 0x8 */
491 short unsigned int es; /* 0x20 0x2 */
492 short unsigned int ds; /* 0x22 0x2 */
493 short unsigned int fsindex; /* 0x24 0x2 */
494 short unsigned int gsindex; /* 0x26 0x2 */
495 long unsigned int fsbase; /* 0x28 0x8 */
496 long unsigned int gsbase; /* 0x30 0x8 */
497 struct perf_event * ptrace_bps[4]; /* 0x38 0x20 */
498 /* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
499 long unsigned int debugreg6; /* 0x58 0x8 */
500 long unsigned int ptrace_dr7; /* 0x60 0x8 */
501 long unsigned int cr2; /* 0x68 0x8 */
502 long unsigned int trap_nr; /* 0x70 0x8 */
503 long unsigned int error_code; /* 0x78 0x8 */
504 /* --- cacheline 2 boundary (128 bytes) --- */
505 struct io_bitmap * io_bitmap; /* 0x80 0x8 */
506 long unsigned int iopl_emul; /* 0x88 0x8 */
507 mm_segment_t addr_limit; /* 0x90 0x8 */
508 unsigned int sig_on_uaccess_err:1; /* 0x98: 0 0x4 */
509 unsigned int uaccess_err:1; /* 0x98:0x1 0x4 */
510
511 /* XXX 30 bits hole, try to pack */
512 /* XXX 36 bytes hole, try to pack */
513
514 /* --- cacheline 3 boundary (192 bytes) --- */
515 struct fpu fpu; /* 0xc0 0x1040 */
516
517 /* size: 4352, cachelines: 68, members: 20 */
518 /* sum members: 4312, holes: 1, sum holes: 36 */
519 /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 30 bits */
520 };
521 $
522
523
OK, I know the struct involved is 'struct thread_struct' and that
the offset is 0x178, so it must be in that 'fpu' struct... No problem,
expand 'struct thread_struct' and combine with grep:
527
528 $ pahole --hex -E thread_struct | egrep '(0x178|struct fpu)' -B4 -A4
529 /* XXX 30 bits hole, try to pack */
530 /* XXX 36 bytes hole, try to pack */
531
532 /* --- cacheline 3 boundary (192 bytes) --- */
533 struct fpu {
534 unsigned int last_cpu; /* 0xc0 0x4 */
535
536 /* XXX 4 bytes hole, try to pack */
537
538 --
539 /* typedef u8 -> __u8 */ unsigned char alimit; /* 0x171 0x1 */
540
541 /* XXX 6 bytes hole, try to pack */
542
543 struct math_emu_info * info; /* 0x178 0x8 */
544 /* --- cacheline 6 boundary (384 bytes) --- */
545 /* typedef u32 -> __u32 */ unsigned int entry_eip; /* 0x180 0x4 */
546 } soft; /* 0x100 0x88 */
547 struct xregs_state {
548 $
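When matching an offset from an OOPS against decimal pahole output, it helps to convert it first. This is a small sketch assuming the default 64-byte cacheline; to_decimal and cacheline_index are hypothetical helper names:

```sh
# to_decimal VALUE: print VALUE (decimal or 0x-prefixed hex) in decimal.
to_decimal() {
    printf '%d\n' "$1"
}

# cacheline_index OFFSET [LINE_SIZE]: zero-based cacheline holding OFFSET.
cacheline_index() {
    echo $(( $(to_decimal "$1") / ${2:-64} ))
}

to_decimal 0x178        # 376
cacheline_index 0x178   # 5
```

This agrees with the output above, where the member at 0x178 is the last one before the "cacheline 6 boundary (384 bytes)" marker.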
549
550
551 Want to know where 'struct thread_struct' is defined in the kernel
552 sources?
553
554 $ pahole -I thread_struct | head -2
555 /* Used at: /sys/kernel/btf/vmlinux */
556 /* <0> (null):0 */
557 $
558
559
That information is not present in BTF, so use DWARF instead. This takes
a little longer and assumes pahole finds the matching vmlinux file:
562
563 $ pahole -Fdwarf -I thread_struct | head -2
564 /* Used at: /home/acme/git/linux/arch/x86/kernel/head64.c */
565 /* <3333> /home/acme/git/linux/arch/x86/include/asm/processor.h:485 */
566 $
567
568
569 To find the biggest data structures in the Linux kernel:
570
571 $ pahole -s | sort -k2 -nr | head -5
572 cmp_data 290904 1
573 dec_datas 274520 1
574 cpu_entry_area 217088 0
575 pglist_data 172928 4
576 saved_cmdlines_buffer 131104 1
577 $
578
579 The second column is the size in bytes and the third is the number of
580 alignment holes in that structure.
581
582 Show data structures that have a raw spinlock and are related to the
583 RCU mechanism:
584
585 $ pahole --contains raw_spinlock_t --prefix rcu
586 rcu_node
587 rcu_data
588 rcu_state
589 $
590
591 To see that in context, combine it with grep:
592
593 $ pahole rcu_state | grep raw_spinlock_t -B1 -A5
594 /* --- cacheline 52 boundary (3328 bytes) --- */
595 raw_spinlock_t ofl_lock; /* 3328 4 */
596
597 /* size: 3392, cachelines: 53, members: 35 */
598 /* sum members: 3250, holes: 7, sum holes: 82 */
599 /* padding: 60 */
600 };
601 $
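The summary comments that close each struct can be cross-checked with simple arithmetic. The sketch below uses the figures printed in this page for struct rcu_state and struct thread_struct and assumes the default 64-byte cacheline; the cachelines helper is a hypothetical name, and the round-up rule is inferred from the outputs shown here rather than taken from pahole's source:

```sh
# cachelines SIZE [LINE_SIZE]: cachelines needed for SIZE bytes, rounded up.
cachelines() {
    echo $(( ($1 + ${2:-64} - 1) / ${2:-64} ))
}

# struct rcu_state: sum members + sum holes + padding = size.
echo $(( 3250 + 82 + 60 ))             # 3392
cachelines 3392                        # 53

# struct thread_struct: members + holes + one 4-byte bitfield unit
# (2 bits used plus a 30 bits hole).
echo $(( 4312 + 36 + (2 + 30) / 8 ))   # 4352
cachelines 4352                        # 68
```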
602
603
PRETTY PRINTING
pahole can also use the data structure types to pretty print raw data
specified via --prettify. To consume raw data from the standard input,
just use '--prettify -'.

It pretty prints the raw data according to the type specified:
611
612 $ pahole -C modversion_info drivers/scsi/sg.ko
613 struct modversion_info {
614 long unsigned int crc; /* 0 8 */
615 char name[56]; /* 8 56 */
616
617 /* size: 64, cachelines: 1, members: 2 */
618 };
619 $
620 $ objcopy -O binary --only-section=__versions drivers/scsi/sg.ko versions
621 $
622 $ ls -la versions
623 -rw-rw-r--. 1 acme acme 7616 Jun 25 11:33 versions
624 $
625 $ pahole --count 3 -C modversion_info drivers/scsi/sg.ko --prettify versions
626 {
627 .crc = 0x8dabd84,
628 .name = "module_layout",
629 },
630 {
631 .crc = 0x45e4617b,
632 .name = "no_llseek",
633 },
634 {
635 .crc = 0xa23fae8c,
636 .name = "param_ops_int",
637 },
638 $
639 $ pahole --skip 1 --count 2 -C modversion_info drivers/scsi/sg.ko --prettify - < versions
640 {
641 .crc = 0x45e4617b,
642 .name = "no_llseek",
643 },
644 {
645 .crc = 0xa23fae8c,
646 .name = "param_ops_int",
647 },
648 $
Skipping the first record is equivalent to seeking past its 64 bytes:
650
651 $ pahole --seek_bytes 64 --count 1 -C modversion_info drivers/scsi/sg.ko --prettify versions
652 {
653 .crc = 0x45e4617b,
654 .name = "no_llseek",
655 },
656 $
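The relation between record counts and byte offsets in the example above is plain arithmetic on the 64-byte size of struct modversion_info. A small sketch, with record_offset and record_count as hypothetical helper names:

```sh
# record_offset INDEX SIZE: byte offset of the zero-based record INDEX
# in a file of fixed SIZE-byte records.
record_offset() {
    echo $(( $1 * $2 ))
}

# record_count FILE_SIZE SIZE: number of whole records in FILE_SIZE bytes.
record_count() {
    echo $(( $1 / $2 ))
}

record_offset 1 64     # 64, the second record starts right after the first
record_count 7616 64   # 119 records in the 7616-byte 'versions' file
```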
657
-C, --class_name=CLASS_NAME
Pretty print according to this class. Arguments may be passed to
it to affect how the pretty printing is performed, e.g.:


-C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_EXIT)'

This selects 'struct perf_event_header' as the type to use to
pretty print records, states that the 'size' field in that struct should
be used to figure out the size of each record (variable sized records),
and that 'enum perf_event_type' should be used to pretty print the
numeric value in perf_event_header->type. Furthermore, that enum is
used to heuristically look for structs with the same name, in lowercase,
as the enum entry that the type field converts to, using such a struct
to pretty print instead of the base 'perf_event_header' type. See the
PRETTY PRINTING EXAMPLES section below.

Furthermore the 'filter=' part can be used, so far with only the '=='
operator, to filter based on the 'type' field, converting the string
'PERF_RECORD_EXIT' to a number according to type_enum.

The 'sizeof' arg defaults to the 'size' member name. If the name is
different, one can use the 'sizeof=sz' form. Ditto for the
'type=other_member_name' field, which defaults to 'type'.
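The name lookup described above amounts to lowercasing the enum entry. A sketch of that mapping only, not of pahole's actual implementation; struct_name_for is a hypothetical helper name:

```sh
# struct_name_for ENUMERATOR: lowercase an enum entry name to obtain the
# struct name that the heuristic would look for.
struct_name_for() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

struct_name_for PERF_RECORD_CGROUP   # perf_record_cgroup
struct_name_for PERF_RECORD_EXIT     # perf_record_exit
```

Whether a struct with the resulting name exists depends on the debugging information of the file being used.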
683
684
PRETTY PRINTING EXAMPLES
Looking at the ELF header for a vmlinux file, using BTF, first let's
discover the ELF header type:
688
689 $ pahole --sizes | grep -i elf | grep -i _h
690 elf64_hdr 64 0
691 elf32_hdr 52 0
692 $
693
694 Now we can use this to show the first record from offset zero:
695
696 $ pahole -C elf64_hdr --count 1 --prettify /lib/modules/5.8.0-rc3+/build/vmlinux
697 {
698 .e_ident = { 127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
699 .e_type = 2,
700 .e_machine = 62,
701 .e_version = 1,
702 .e_entry = 16777216,
703 .e_phoff = 64,
704 .e_shoff = 775923840,
705 .e_flags = 0,
706 .e_ehsize = 64,
707 .e_phentsize = 56,
708 .e_phnum = 5,
709 .e_shentsize = 64,
710 .e_shnum = 80,
711 .e_shstrndx = 79,
712 },
713 $
714
715 This is equivalent to:
716
717 $ pahole --header elf64_hdr --prettify /lib/modules/5.8.0-rc3+/build/vmlinux
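The first four values of .e_ident in that output are the ELF magic: byte 0x7f followed by the letters "ELF". The sketch below reproduces them with od(1) on synthesized input, so it needs no ELF file; elf_magic is a hypothetical helper name:

```sh
# elf_magic: read standard input and print its first four bytes as
# unsigned decimals separated by single spaces.
elf_magic() {
    # Word splitting normalizes od's column padding.
    set -- $(od -An -tu1 -N4)
    echo "$*"
}

printf '\177ELF' | elf_magic   # 127 69 76 70
```

Running elf_magic with its input redirected from a real ELF file should print the same four numbers.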
718
The --header option also allows other command line options to refer
to fields in the header. This is useful when one wants to show multiple
records in a file and the range where those records are located is
specified in header fields, such as for perf.data files:
723
724 $ pahole --hex ~/bin/perf --header perf_file_header --prettify perf.data
725 {
726 .magic = 0x32454c4946524550,
727 .size = 0x68,
728 .attr_size = 0x88,
729 .attrs = {
730 .offset = 0xa8,
731 .size = 0x88,
732 },
733 .data = {
734 .offset = 0x130,
735 .size = 0x588,
736 },
737 .event_types = {
738 .offset = 0,
739 .size = 0,
740 },
741 .adds_features = { 0x16717ffc, 0, 0, 0 },
742 },
743 $
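The .magic value above is an 8-byte little-endian integer whose bytes spell an ASCII tag. A sketch that decodes it, least significant byte first; magic_to_ascii is a hypothetical helper name and it assumes every byte is printable ASCII:

```sh
# magic_to_ascii HEX: print the bytes of a hexadecimal integer as text,
# starting from the least significant byte (little-endian order).
magic_to_ascii() {
    hex=${1#0x}
    # Pad to an even number of digits so bytes can be peeled off in pairs.
    case ${#hex} in
        *[13579]) hex=0$hex ;;
    esac
    out=
    while [ -n "$hex" ]; do
        byte=${hex#"${hex%??}"}    # last two hex digits
        hex=${hex%??}              # drop them from the remainder
        out=$out$(printf "\\$(printf '%03o' "0x$byte")")
    done
    echo "$out"
}

magic_to_ascii 0x32454c4946524550   # PERFILE2
```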
744
745 So to display the cgroups records in the perf_file_header.data section
746 we can use:
747
748 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_CGROUP)' --prettify perf.data
749 {
750 .header = {
751 .type = PERF_RECORD_CGROUP,
752 .misc = 0,
753 .size = 40,
754 },
755 .id = 1,
756 .path = "/",
757 },
758 {
759 .header = {
760 .type = PERF_RECORD_CGROUP,
761 .misc = 0,
762 .size = 48,
763 },
764 .id = 1553,
765 .path = "/system.slice",
766 },
767 {
768 .header = {
769 .type = PERF_RECORD_CGROUP,
770 .misc = 0,
771 .size = 48,
772 },
773 .id = 8,
774 .path = "/machine.slice",
775 },
776 {
777 .header = {
778 .type = PERF_RECORD_CGROUP,
779 .misc = 0,
780 .size = 128,
781 },
782 .id = 7828,
783 .path = "/machine.slice/libpod-42be8e8d4eb9d22405845005f0d04ea398548dccc934a150fbaa3c1f1f9492c2.scope",
784 },
785 {
786 .header = {
787 .type = PERF_RECORD_CGROUP,
788 .misc = 0,
789 .size = 88,
790 },
791 .id = 13,
792 .path = "/machine.slice/machine-qemu\x2d1\x2drhel6.sandy.scope",
793 },
794 $
795
796 For the common case of the header having a member that has the 'offset'
797 and 'size' members, it is possible to use this more compact form:
798
799 $ pahole ~/bin/perf --header=perf_file_header --range=data -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_CGROUP)' --prettify perf.data
800
This uses ~/bin/perf to get the type definitions, then defines 'struct
perf_file_header' as the header, then seeks '$header.data.offset' bytes
from the start of the file, and considers '$header.data.size' bytes
worth of such records. The filter expression may omit a common prefix,
so in this case it could equivalently be written as
'filter=type==CGROUP'. The 'filter=' can also be omitted, getting as
compact as 'type==CGROUP'.
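For the perf.data header dumped earlier (data offset 0x130, size 0x588), the byte range and the prefix expansion work out as follows. This is an illustrative sketch only; range_end and full_enumerator are hypothetical helper names, not pahole features:

```sh
# range_end OFFSET SIZE: first byte past a range (hex or decimal inputs).
range_end() {
    echo $(( $1 + $2 ))
}

# full_enumerator PREFIX NAME: prepend the common PREFIX to NAME unless
# NAME already starts with it.
full_enumerator() {
    case $2 in
        "$1"*) printf '%s\n' "$2" ;;
        *)     printf '%s%s\n' "$1" "$2" ;;
    esac
}

range_end 0x130 0x588                # 1720, records span bytes 304..1719
full_enumerator PERF_RECORD_ CGROUP  # PERF_RECORD_CGROUP
```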
808
809 If we look at:
810
811 $ pahole ~/bin/perf -C perf_event_header
812 struct perf_event_header {
813 __u32 type; /* 0 4 */
814 __u16 misc; /* 4 2 */
815 __u16 size; /* 6 2 */
816
817 /* size: 8, cachelines: 1, members: 3 */
818 /* last cacheline: 8 bytes */
819 };
820 $
821
822 And:
823
824 $ pahole ~/bin/perf -C perf_event_type
825 enum perf_event_type {
826 PERF_RECORD_MMAP = 1,
827 PERF_RECORD_LOST = 2,
828 PERF_RECORD_COMM = 3,
829 PERF_RECORD_EXIT = 4,
830 PERF_RECORD_THROTTLE = 5,
831 PERF_RECORD_UNTHROTTLE = 6,
832 PERF_RECORD_FORK = 7,
833 PERF_RECORD_READ = 8,
834 PERF_RECORD_SAMPLE = 9,
835 PERF_RECORD_MMAP2 = 10,
836 PERF_RECORD_AUX = 11,
837 PERF_RECORD_ITRACE_START = 12,
838 PERF_RECORD_LOST_SAMPLES = 13,
839 PERF_RECORD_SWITCH = 14,
840 PERF_RECORD_SWITCH_CPU_WIDE = 15,
841 PERF_RECORD_NAMESPACES = 16,
842 PERF_RECORD_KSYMBOL = 17,
843 PERF_RECORD_BPF_EVENT = 18,
844 PERF_RECORD_CGROUP = 19,
845 PERF_RECORD_TEXT_POKE = 20,
846 PERF_RECORD_MAX = 21,
847 };
848 $
849
850 And furthermore:
851
852 $ pahole ~/bin/perf -C perf_record_cgroup
853 struct perf_record_cgroup {
854 struct perf_event_header header; /* 0 8 */
855 __u64 id; /* 8 8 */
856 char path[4096]; /* 16 4096 */
857
858 /* size: 4112, cachelines: 65, members: 3 */
859 /* last cacheline: 16 bytes */
860 };
861 $
862
863 Then we can see how the perf_event_header.type could be converted from
864 a __u32 to a string (PERF_RECORD_CGROUP). If we remove that
865 type_enum=perf_event_type, we will lose the conversion of 'struct
866 perf_event_header' to the more descriptive 'struct perf_record_cgroup',
867 and also the beautification of the header.type field:
868
869 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,filter=type==19)' --prettify perf.data
870 {
871 .type = 19,
872 .misc = 0,
873 .size = 40,
874 },
875 {
876 .type = 19,
877 .misc = 0,
878 .size = 48,
879 },
880 {
881 .type = 19,
882 .misc = 0,
883 .size = 48,
884 },
885 {
886 .type = 19,
887 .misc = 0,
888 .size = 128,
889 },
890 {
891 .type = 19,
892 .misc = 0,
893 .size = 88,
894 },
895 $
896
Some of the record types are not found in 'type_enum=perf_event_type', so
some of the records don't get converted to a type that fully shows their
contents. For perf we know that those are in another enumeration, 'enum
perf_user_event_type', so, for these cases, we can create a 'virtual
enum', i.e. the sum of two enums, and then get all those entries decoded
and properly cast. Here are the first few records with just 'enum
perf_event_type':
904
905 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type)' --count 4 --prettify perf.data
906 {
907 .type = 79,
908 .misc = 0,
909 .size = 32,
910 },
911 {
912 .type = 73,
913 .misc = 0,
914 .size = 40,
915 },
916 {
917 .type = 74,
918 .misc = 0,
919 .size = 32,
920 },
921 {
922 .header = {
923 .type = PERF_RECORD_CGROUP,
924 .misc = 0,
925 .size = 40,
926 },
927 .id = 1,
928 .path = "/",
929 },
930 $
931
932 Now with both enumerations, i.e. with
933 'type_enum=perf_event_type+perf_user_event_type':
934
935 $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type+perf_user_event_type)' --count 5 --prettify perf.data
936 {
937 .header = {
938 .type = PERF_RECORD_TIME_CONV,
939 .misc = 0,
940 .size = 32,
941 },
942 .time_shift = 31,
943 .time_mult = 1016803377,
944 .time_zero = 435759009518382,
945 },
946 {
947 .header = {
948 .type = PERF_RECORD_THREAD_MAP,
949 .misc = 0,
950 .size = 40,
951 },
952 .nr = 1,
953 .entries = 0x50 0x7e 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00,
954 },
955 {
956 .header = {
957 .type = PERF_RECORD_CPU_MAP,
958 .misc = 0,
959 .size = 32,
960 },
961 .data = {
962 .type = 1,
963 .data = "",
964 },
965 },
966 {
967 .header = {
968 .type = PERF_RECORD_CGROUP,
969 .misc = 0,
970 .size = 40,
971 },
972 .id = 1,
973 .path = "/",
974 },
975 {
976 .header = {
977 .type = PERF_RECORD_CGROUP,
978 .misc = 0,
979 .size = 48,
980 },
981 .id = 1553,
982 .path = "/system.slice",
983 },
984 $
985
It is possible to pass multiple types; one only has to make sure they
appear in the file in sequence. For the perf.data example (see the
perf_file_header dump above), one can print the perf_file_attr structs
in the header attrs range, then the perf_event_header in the data range
with the following command:

$ pahole ~/bin/perf --header=perf_file_header -C 'perf_file_attr(range=attrs),perf_event_header(range=data,sizeof,type,type_enum=perf_event_type+perf_user_event_type)' --prettify perf.data
993
994
SEE ALSO
eu-readelf(1), readelf(1), objdump(1).
997
998 https://www.kernel.org/doc/ols/2007/ols2007v2-pages-35-44.pdf.
999
AUTHOR
pahole was written and is maintained by Arnaldo Carvalho de Melo
<acme@kernel.org>.

Thanks to Andrii Nakryiko and Martin KaFai Lau for providing the BTF
encoder and improving the codebase while making sure the BTF encoder
works as needed to be used in encoding the Linux kernel .BTF section
from the DWARF info generated by gcc. For that Andrii wrote a BTF
deduplicator in libbpf that is used by pahole.
1009
1010 Also thanks to Conectiva, Mandriva and Red Hat for allowing me to work
1011 on these tools.
1012
1013 Please send bug reports to <dwarves@vger.kernel.org>.
1014
1015 No subscription is required.
1016
1017
1018
dwarves January 16, 2020 pahole(1)