pahole(1)                           dwarves                          pahole(1)

NAME

       pahole - Shows and manipulates data structure layouts and pretty prints
       raw data.

SYNOPSIS

10       pahole [options] files
11

DESCRIPTION

       pahole shows data structure layouts encoded in debugging information
       formats. DWARF, CTF and BTF are supported.

       This is useful for, among other things: optimizing important data
       structures by reducing their size, figuring out which field sits at a
       given offset from the start of a data structure, investigating ABI
       changes and, more generally, understanding a new codebase you have to
       work with.
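
       As a rough illustration of what "reducing size" means here, the sketch
       below uses Python's ctypes module, which follows the platform C ABI, to
       compare two orderings of the same members. The struct names and
       members are hypothetical and are not taken from any real codebase; the
       exact sizes depend on the ABI, so the comments only describe x86-64.

```python
import ctypes


class Padded(ctypes.Structure):
    # Hypothetical struct: small members on both sides of a large one.
    _fields_ = [
        ("a", ctypes.c_uint8),
        ("b", ctypes.c_uint64),
        ("c", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same members, largest first, so the small ones share one slot.
    _fields_ = [
        ("b", ctypes.c_uint64),
        ("a", ctypes.c_uint8),
        ("c", ctypes.c_uint8),
    ]


def layout(struct_type):
    """Return (name, offset, size) per member, like pahole's comments."""
    return [
        (name, getattr(struct_type, name).offset, getattr(struct_type, name).size)
        for name, _ctype in struct_type._fields_
    ]


def unused_bytes(struct_type):
    """Bytes not covered by any member: holes plus trailing padding."""
    used = sum(size for _name, _offset, size in layout(struct_type))
    return ctypes.sizeof(struct_type) - used


if __name__ == "__main__":
    for t in (Padded, Reordered):
        # On x86-64 Padded is 24 bytes and Reordered is 16 bytes.
        print(t.__name__, ctypes.sizeof(t), layout(t), unused_bytes(t))
```

       pahole reports the same kind of information directly from debugging
       data, without having to re-declare the types by hand.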
21
       It also uses these structure layouts to pretty print data fed to its
       standard input, e.g.:
24
25       $ pahole --header elf64_hdr --prettify /lib/modules/5.8.0-rc6+/build/vmlinux
26       {
27            .e_ident = { 127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
28            .e_type = 2,
29            .e_machine = 62,
30            .e_version = 1,
31            .e_entry = 16777216,
32            .e_phoff = 64,
33            .e_shoff = 604653784,
34            .e_flags = 0,
35            .e_ehsize = 64,
36            .e_phentsize = 56,
37            .e_phnum = 5,
38            .e_shentsize = 64,
39            .e_shnum = 80,
40            .e_shstrndx = 79,
41       },
42       $
43
44       See the PRETTY PRINTING section for further examples and documentation.
45
46       The files must have associated debugging information.  This information
47       may be inside the file itself, in ELF sections, or in another file.
48
       One way to have this information is to specify the -g option to the
       compiler when building the file. When this is done the information
       will be stored in ELF sections. For the DWARF debugging information
       format this adds, among others, the .debug_info ELF section. CTF is
       found in just one ELF section, .SUNW_ctf. BTF comes in at least the
       .BTF ELF section, and may also come with the .BTF.ext ELF section.
55
56       The debuginfo packages available in most Linux distributions  are  also
57       supported  by pahole, where the debugging information is available in a
58       separate file.
59
60       By default, pahole shows the layout of all named structs in  the  files
61       specified.
62
       If no files are specified, pahole checks whether /sys/kernel/btf/vmlinux
       is present and, if so, uses the BTF information in it about the
       running kernel, i.e. this works:
66
67       $ pahole list_head
68       struct list_head {
69            struct list_head *         next;                 /*     0     8 */
70            struct list_head *         prev;                 /*     8     8 */
71
72            /* size: 16, cachelines: 1, members: 2 */
73            /* last cacheline: 16 bytes */
74       };
75       $
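
       The "cachelines" and "last cacheline" figures in the summary comments
       follow from the struct size and the cacheline size. A minimal sketch
       of that arithmetic, assuming 64-byte cachelines as in the output above
       (the function name is made up for this illustration and is not part of
       pahole):

```python
def cacheline_summary(size, cacheline_size=64):
    """Return (cachelines spanned, bytes used in the last cacheline).

    A second value of 0 means the last cacheline is completely full.
    """
    cachelines = -(-size // cacheline_size)  # ceiling division
    last_cacheline_bytes = size % cacheline_size
    return cachelines, last_cacheline_bytes


if __name__ == "__main__":
    # struct list_head: size 16 -> 1 cacheline, 16 bytes in the last one.
    print(cacheline_summary(16))
```

       The same arithmetic applies with other values passed to
       --cacheline_size.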
76
77       If  BTF  is  not  present  and  no  file is passed, then a vmlinux that
78       matches the build-id for the running kernel will be looked  up  in  the
79       usual  places,  including  where  the kernel debuginfo packages put it,
80       looking for DWARF info instead.
81
82       See the EXAMPLES section for more usage suggestions.
83
       It also pretty prints whatever is fed to its standard input, according
       to the type specified, see the PRETTY PRINTING EXAMPLES section.
86
87       Use --count to state how many records should be pretty printed.
88
89

OPTIONS

91       pahole supports the following options.
92
93
94       -C, --class_name=CLASS_NAMES
95              Show  just  these classes. This can be a comma separated list of
96              class names or file URLs (e.g.: file://class_list.txt)
97
98
99       -c, --cacheline_size=SIZE
100              Set cacheline size to SIZE bytes.
101
102
       --sort Sort the output by type name. This may grow to allow sorting by
              other criteria.

              This is mostly needed so that pretty printing from BTF and DWARF
              can be compared when using multiple threads to load DWARF data,
              when the order in which the types in the compile units are
              processed is not deterministic.
110
111
       --compile
              Generate compilable code, with all definitions for all types,
              e.g.:

              $ pahole --compile > vmlinux.h

              This produces a header that can be included in a C source file
              and built. In the example provided it will use the BTF info if
              available, otherwise it will look for a DWARF file matching the
              running kernel build-id.
121
122
       --skip_emitting_atomic_typedefs
              Do not emit 'typedef _Atomic int atomic_int' & friends when used
              with options like --compile. Use it if the compiler already
              provides these. As of circa 2022, with gcc 12.2.1, those are not
              encoded in DWARF, so to generate compilable code pahole needs to
              emit those typedefs for the atomic types used in the data
              structures being emitted from debugging information.
130
131
132       --count=COUNT
133              Pretty print the first COUNT records from input.
134
135
136       --skip=COUNT
137              Skip COUNT input records.
138
139
       -E, --expand_types
              Expand class members. Useful to find which member of an inner
              struct sits at a given offset from the beginning of a struct.
143
144
145       -F, --format_path
146              Allows  specifying a list of debugging formats to try, in order.
147              Right now this includes "btf", "ctf" and  "dwarf".  The  default
148              format path used is equivalent to "-F dwarf,btf,ctf".
149
150
151       --hashbits=BITS
152              Allows  specifying  the  number of bits for the debugging format
153              loader to use.  The only one affected so far is the "dwarf" one,
154              its  default now is 15, the maximum for it is now 21 bits. Tweak
155              it to see if it improves performance as the kernel  evolves  and
156              more types and functions have to be loaded.
157
158
159       --hex  Print offsets and sizes in hexadecimal.
160
161
162       -r, --rel_offset
163              Show relative offsets of members in inner structs.
164
165
166       -p, --expand_pointers
167              Expand class pointer members.
168
169
170       -R, --reorganize
171              Reorganize struct, demoting and combining bitfields, moving mem‐
172              bers to remove alignment holes and padding.
173
174
175       -S, --show_reorg_steps
176              Show the struct layout at each reorganization step.
177
178
       -i, --contains=CLASS_NAME
              Show classes that contain CLASS_NAME.
181
182
183       -a, --anon_include
184              Include anonymous classes.
185
186
187       -A, --nested_anon_include
188              Include nested (inside other structs) anonymous classes.
189
190
       -B, --bit_holes=NR_HOLES
              Show only structs with at least NR_HOLES bit holes.
193
194
195       -d, --recursive
196              Recursive mode, affects several other flags.
197
198
       -D, --decl_exclude=PREFIX
              Exclude classes declared in files with PREFIX.
201
202
203       -f, --find_pointers_to=CLASS_NAME
204              Find pointers to CLASS_NAME.
205
206
207       -H, --holes=NR_HOLES
208              Show only structs with at least NR_HOLES holes.
209
210
211       -I, --show_decl_info
212              Show the file and line number where the tags  were  defined,  if
213              available in the debugging information.
214
215
216       --skip_encoding_btf_vars
217              Do not encode VARs in BTF.
218
219
220       --skip_encoding_btf_decl_tag
221              Do not encode decl tags in BTF.
222
223
224       --skip_encoding_btf_enum64
225              Do not encode enum64 in BTF.
226
227
228       --skip_encoding_btf_type_tag
229              Do not encode type tags in BTF.
230
231
232       --skip_encoding_btf_inconsistent_proto
233              Do not encode functions with multiple inconsistent prototypes or
234              unexpected register use for their parameters, where  the  regis‐
235              ters used do not match calling conventions.
236
237
238       -j, --jobs=N
239              Run  N jobs in parallel. Defaults to number of online processors
240              + 10% (like the 'ninja' build system) if no argument  is  speci‐
241              fied.
242
243
       -J, --btf_encode
              Encode BTF information from DWARF. Used in the Linux kernel
              build process when CONFIG_DEBUG_INFO_BTF=y is present, introduced
              in Linux v5.2. Used to implement features such as BPF CO-RE
              (Compile Once - Run Everywhere).
249
250              See https://nakryiko.com/posts/bpf-portability-and-co-re/.
251
252
253       --btf_encode_detached=FILENAME
254              Same thing as -J/--btf_encode, but storing the raw BTF info into
255              a separate file.
256
257
258       --btf_encode_force
259              Ignore those symbols found invalid when encoding BTF.
260
261
262       --btf_base=PATH
263              Path  to  the base BTF file, for instance: vmlinux when encoding
264              kernel module BTF information.  This may be inferred when asking
265              for  a /sys/kernel/btf/MODULE, when it will be autoconfigured to
266              "/sys/kernel/btf/vmlinux".
267
268
269       --btf_gen_floats
270              Allow producing BTF_KIND_FLOAT entries in systems where the  vm‐
271              linux DWARF information has float types.
272
273
274       --btf_gen_optimized
275              Generate  BTF  for  functions with optimization-related suffixes
276              (.isra, .constprop).
277
278
279       --btf_gen_all
280              Allow using all the BTF features supported by pahole.
281
282
283       -l, --show_first_biggest_size_base_type_member
284              Show first biggest size base_type member.
285
286
       -m, --nr_methods
              Show number of methods of all classes, i.e. the number of
              functions that have arguments that are pointers to a given class.

              To get the number of methods for a specific class, use:
292
293                $ pahole --nr_methods | grep -w sock
294                sock  1005
295                $
296
297              In  the  above  example it used the BTF information in /sys/ker‐
298              nel/btf/vmlinux.
299
300
301       -M, --show_only_data_members
302              Show only the members that use space in the  class  layout.  C++
303              methods will be suppressed.
304
305
306       -n, --nr_members
307              Show number of members.
308
309
       -N, --class_name_len
              Show the length of class names.
312
313
314       -O, --dwarf_offset=OFFSET
315              Show tag with DWARF OFFSET.
316
317
       -P, --packable
              Show only structs that have holes that can be packed if members
              are reorganized, for instance when using the --reorganize
              option.


       --with_flexible_array
              Show only structs that have a flexible array.
326
327
328       -q, --quiet
329              Be quieter.
330
331
332       -s, --sizes
333              Show size of classes.
334
335
336       -t, --separator=SEP
337              Use SEP as the field separator.
338
339
340       -T, --nr_definitions
341              Show how many times struct was defined.
342
343
344       -u, --defined_in
345              Show CUs where CLASS_NAME (-C) is defined.
346
347
348       --flat_arrays
349              Flatten  arrays, so that array[10][2] becomes array[20].  Useful
350              when generating from both CTF/BTF and DWARF  encodings  for  the
351              same binary for testing purposes.
352
353
       --suppress_aligned_attribute
              Suppress forced alignment markers, so that one can compare BTF
              or CTF output, which doesn't have that info, to output from
              DWARF >= 5.


       --suppress_force_paddings
              Suppress bitfield forced padding at the end of structs, as this
              requires something like DWARF's DW_AT_alignment, so that one can
              compare with BTF or CTF output, which doesn't have that info.


       --suppress_packed
              Suppress the output of the inference of
              __attribute__((__packed__)), so that one can compare with BTF or
              CTF output. The inference algorithm uses things like
              DW_AT_alignment, so until it is improved to infer that as well
              for BTF, allow disabling this output.
374
375
376       --fixup_silly_bitfields
377              Converts  silly  bitfields  such  as  "int foo:32" to plain "int
378              foo".
379
380
       -V, --verbose
              Be verbose.
383
384
385       --ptr_table_stats
386              Print statistics about ptr_table data structures, used  to  hold
387              all  the types, tags and functions data structures, for develop‐
388              ment tuning of such tables, tuned for  a  typical  2021  vmlinux
389              file.
390
391
392       -w, --word_size=WORD_SIZE
393              Change the arch word size to WORD_SIZE.
394
395
396       -x, --exclude=PREFIX
397              Exclude PREFIXed classes.
398
399
400       -X, --cu_exclude=PREFIX
401              Exclude PREFIXed compilation units.
402
403
       --lang=languages
              Only process compilation units built from source code written in
              the specified languages.

              Supported languages:

                ada83, ada95, asm, bliss, c, c89, c99, c11, c++, c++03, c++11,
                c++14, cobol74, cobol85, d, dylan, fortran77, fortran90,
                fortran95, fortran03, fortran08, go, haskell, java, julia,
                modula2, modula3, objc, objc++, ocaml, opencl, pascal83, pli,
                python, renderscript, rust, swift, upc

              The Linux kernel, for instance, is written in 'c89' circa 2022,
              use that in filters.


       --lang_exclude=languages
              Don't process compilation units built from source code written
              in the specified languages.

              To filter out compilation units written in Rust, for instance,
              use:

               pahole -j --btf_encode --lang_exclude rust
428
429
430       -y, --prefix_filter=PREFIX
431              Include PREFIXed classes.
432
433
434       -z, --hole_size_ge=HOLE_SIZE
435              Show only structs with at least one hole  greater  or  equal  to
436              HOLE_SIZE.
437
438
439       --structs
440              Show  only  structs,  all  the other filters apply, i.e. to show
441              just the sizes of all structs combine  --structs  with  --sizes,
442              etc.
443
444
445       --packed
446              Show  only  packed structs, all the other filters apply, i.e. to
447              show just the sizes of all packed structs combine --packed  with
448              --sizes, etc.
449
450
       --unions
              Show only unions, all the other filters apply, i.e. to show just
              the sizes of all unions combine --unions with --sizes, etc.
454
455
456       --version
457              Show a traditional string version, i.e.: "v1.18".
458
459
       --numeric_version
              Show a numeric only version, suitable for use in Makefiles and
              scripts where one wants to know whether the installed version
              has some feature, e.g.: 118 instead of "v1.18".
464
465
466       --kabi_prefix=STRING
467              When the prefix of the string is STRING,  treat  the  string  as
468              STRING.
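
       Going back to the --numeric_version option described above, the sketch
       below shows how a script might derive the same kind of number from a
       traditional version string in order to gate on a feature. The mapping
       used here (major * 100 + minor) is an assumption inferred from the
       "v1.18" to 118 example, and the function name is hypothetical.

```python
import re


def numeric_version(version):
    """Convert a 'vMAJOR.MINOR' string to a number, e.g. 'v1.18' -> 118.

    Assumes major * 100 + minor, inferred from the man page example.
    """
    match = re.fullmatch(r"v(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = (int(part) for part in match.groups())
    return major * 100 + minor


if __name__ == "__main__":
    # A script could compare this against a minimum required version.
    print(numeric_version("v1.18") >= 118)
```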
469
470

NOTES

       To enable the generation of debugging information in the Linux kernel
       build process select CONFIG_DEBUG_INFO. This can be done using make
       menuconfig by following this path: "Kernel Hacking" -> "Compile-time
       checks and compiler options" -> "Compile the kernel with debug info".
       Consider also enabling CONFIG_DEBUG_INFO_BTF by following the same
       menuconfig path and then selecting "Generate BTF typeinfo". Most modern
       distributions with eBPF support should come with that enabled in all
       their kernels, greatly facilitating the use of pahole.

       Many distributions also come with debuginfo packages, so just enable
       them in your package manager repository configuration and install
       kernel-debuginfo, or the debuginfo package for any userspace program
       written in a language for which the compiler generates debuginfo (C,
       C++, for instance).
485
486

EXAMPLES

       All the examples here use either /sys/kernel/btf/vmlinux, if present,
       or look up a vmlinux file matching the running kernel, using the
       build-id info found in /sys/kernel/notes to make sure it matches.
491
492       Show a type:
493
494       $ pahole -C __u64
495       typedef long long unsigned int __u64;
496       $
497
498
499       Works as well if the only argument is a type name:
500
501       $ pahole raw_spinlock_t
502       typedef struct raw_spinlock raw_spinlock_t;
503       $
504
505
506       Multiple types can be passed, separated by commas:
507
508       $ pahole raw_spinlock_t,raw_spinlock
509       struct raw_spinlock {
510            arch_spinlock_t            raw_lock;             /*     0     4 */
511
512            /* size: 4, cachelines: 1, members: 1 */
513            /* last cacheline: 4 bytes */
514       };
515       typedef struct raw_spinlock raw_spinlock_t;
516       $
517
518
519       Types can be expanded:
520
521       $ pahole -E raw_spinlock
522       struct raw_spinlock {
523               /* typedef arch_spinlock_t */ struct qspinlock {
524                       union {
525                               /* typedef atomic_t */ struct {
526                                       int counter;                                                  /*     0     4 */
527                               } val;                                                                /*     0     4 */
528                               struct {
529                                       /* typedef u8 -> __u8 */ unsigned char locked;                /*     0     1 */
530                                       /* typedef u8 -> __u8 */ unsigned char pending;               /*     1     1 */
531                               };                                                                    /*     0     2 */
532                               struct {
533                                       /* typedef u16 -> __u16 */ short unsigned int locked_pending; /*     0     2 */
534                                       /* typedef u16 -> __u16 */ short unsigned int tail;           /*     2     2 */
535                               };                                                                    /*     0     4 */
536                       };                                                                            /*     0     4 */
537               } raw_lock;                                                                           /*     0     4 */
538
539               /* size: 4, cachelines: 1, members: 1 */
540               /* last cacheline: 4 bytes */
541       };
542       $
543
544
545       When decoding OOPSes you may want to see the offsets and sizes in hexa‐
546       decimal:
547
548       $ pahole --hex thread_struct
549       struct thread_struct {
550               struct desc_struct         tls_array[3];         /*     0  0x18 */
551               long unsigned int          sp;                   /*  0x18   0x8 */
552               short unsigned int         es;                   /*  0x20   0x2 */
553               short unsigned int         ds;                   /*  0x22   0x2 */
554               short unsigned int         fsindex;              /*  0x24   0x2 */
555               short unsigned int         gsindex;              /*  0x26   0x2 */
556               long unsigned int          fsbase;               /*  0x28   0x8 */
557               long unsigned int          gsbase;               /*  0x30   0x8 */
558               struct perf_event *        ptrace_bps[4];        /*  0x38  0x20 */
559               /* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
560               long unsigned int          debugreg6;            /*  0x58   0x8 */
561               long unsigned int          ptrace_dr7;           /*  0x60   0x8 */
562               long unsigned int          cr2;                  /*  0x68   0x8 */
563               long unsigned int          trap_nr;              /*  0x70   0x8 */
564               long unsigned int          error_code;           /*  0x78   0x8 */
565               /* --- cacheline 2 boundary (128 bytes) --- */
566               struct io_bitmap *         io_bitmap;            /*  0x80   0x8 */
567               long unsigned int          iopl_emul;            /*  0x88   0x8 */
568               mm_segment_t               addr_limit;           /*  0x90   0x8 */
569               unsigned int               sig_on_uaccess_err:1; /*  0x98: 0 0x4 */
570               unsigned int               uaccess_err:1;        /*  0x98:0x1 0x4 */
571
572               /* XXX 30 bits hole, try to pack */
573               /* XXX 36 bytes hole, try to pack */
574
575               /* --- cacheline 3 boundary (192 bytes) --- */
576               struct fpu                 fpu;                  /*  0xc0 0x1040 */
577
578               /* size: 4352, cachelines: 68, members: 20 */
579               /* sum members: 4312, holes: 1, sum holes: 36 */
580               /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 30 bits */
581       };
582       $
583
584
       Suppose you know that the struct involved is a 'struct thread_struct'
       and that the offset is 0x178, so it must be in that 'fpu' struct. No
       problem, expand 'struct thread_struct' and combine with grep:
588
589       $ pahole --hex -E thread_struct | egrep '(0x178|struct fpu)' -B4 -A4
590               /* XXX 30 bits hole, try to pack */
591               /* XXX 36 bytes hole, try to pack */
592
593               /* --- cacheline 3 boundary (192 bytes) --- */
594               struct fpu {
595                       unsigned int       last_cpu;                                             /*  0xc0   0x4 */
596
597                       /* XXX 4 bytes hole, try to pack */
598
599       --
600                                       /* typedef u8 -> __u8 */ unsigned char alimit;           /* 0x171   0x1 */
601
602                                       /* XXX 6 bytes hole, try to pack */
603
604                                       struct math_emu_info * info;                             /* 0x178   0x8 */
605                                       /* --- cacheline 6 boundary (384 bytes) --- */
606                                       /* typedef u32 -> __u32 */ unsigned int entry_eip;       /* 0x180   0x4 */
607                               } soft; /* 0x100  0x88 */
608                               struct xregs_state {
609       $
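
       The offset lookup done above with --expand_types and grep amounts to
       walking nested structs until the member covering the offset is found.
       A minimal sketch of that walk, using Python's ctypes and two small
       hypothetical structs (Inner and Outer are not kernel types):

```python
import ctypes


class Inner(ctypes.Structure):
    _fields_ = [("x", ctypes.c_uint32), ("y", ctypes.c_uint32)]


class Outer(ctypes.Structure):
    _fields_ = [
        ("id", ctypes.c_uint64),
        ("inner", Inner),
        ("tail", ctypes.c_uint8),
    ]


def member_at(struct_type, offset, base=0, prefix=""):
    """Return (dotted member path, member start offset) covering 'offset'.

    Returns None when the offset falls in a hole, in padding or outside
    the struct.
    """
    for name, ctype in struct_type._fields_:
        field = getattr(struct_type, name)
        start = base + field.offset
        if start <= offset < start + field.size:
            path = prefix + name
            if isinstance(ctype, type) and issubclass(ctype, ctypes.Structure):
                # Descend into the nested struct, keeping absolute offsets.
                inner = member_at(ctype, offset, start, path + ".")
                return inner if inner is not None else (path, start)
            return path, start
    return None


if __name__ == "__main__":
    print(member_at(Outer, 0xC))  # offset 12 falls in inner.y
```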
610
611
612       Want to know where 'struct thread_struct'  is  defined  in  the  kernel
613       sources?
614
615       $ pahole -I thread_struct | head -2
616       /* Used at: /sys/kernel/btf/vmlinux */
617       /* <0> (null):0 */
618       $
619
620
621       Not present in BTF, so use DWARF, takes a little bit longer, and assum‐
622       ing it finds the matching vmlinux file:
623
624       $ pahole -Fdwarf -I thread_struct | head -2
625       /* Used at: /home/acme/git/linux/arch/x86/kernel/head64.c */
626       /* <3333> /home/acme/git/linux/arch/x86/include/asm/processor.h:485 */
627       $
628
629
630       To find the biggest data structures in the Linux kernel:
631
632       $ pahole -s | sort -k2 -nr | head -5
633       cmp_data               290904 1
634       dec_datas              274520 1
635       cpu_entry_area         217088 0
636       pglist_data            172928 4
637       saved_cmdlines_buffer  131104 1
638       $
639
640       The second column is the size in bytes and the third is the  number  of
641       alignment holes in that structure.
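
       The same ranking can be done in a script instead of with sort and
       head. The sketch below parses text in the three-column form shown
       above (name, size, holes); the rows in SAMPLE are copied from outputs
       in this page, and the function name is made up for the illustration.

```python
SAMPLE = """\
rcu_state 3392 7
cmp_data 290904 1
list_head 16 0
pglist_data 172928 4
saved_cmdlines_buffer 131104 1
"""


def biggest(text, n=5):
    """Return the n largest (name, size, holes) rows, biggest first."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a name/size/holes row
        name, size, holes = fields
        rows.append((name, int(size), int(holes)))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


if __name__ == "__main__":
    for name, size, holes in biggest(SAMPLE, 3):
        print(f"{name:24} {size:8} {holes}")
```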
642
643       Show  data  structures  that have a raw spinlock and are related to the
644       RCU mechanism:
645
646       $ pahole --contains raw_spinlock_t --prefix rcu
647       rcu_node
648       rcu_data
649       rcu_state
650       $
651
652       To see that in context, combine it with grep:
653
654       $ pahole rcu_state | grep raw_spinlock_t -B1 -A5
655            /* --- cacheline 52 boundary (3328 bytes) --- */
656            raw_spinlock_t             ofl_lock;             /*  3328     4 */
657
658            /* size: 3392, cachelines: 53, members: 35 */
659            /* sum members: 3250, holes: 7, sum holes: 82 */
660            /* padding: 60 */
661       };
662       $
663
664

PRETTY PRINTING

       pahole can also use the data structure types to pretty print raw data
       specified via --prettify. To consume raw data from the standard input,
       just use '--prettify -'.

       The examples below pretty print raw data, from a file and from stdin,
       according to the type specified:
672
673       $ pahole -C modversion_info drivers/scsi/sg.ko
674       struct modversion_info {
675             long unsigned int          crc;                  /*     0     8 */
676             char                       name[56];             /*     8    56 */
677
678             /* size: 64, cachelines: 1, members: 2 */
679       };
680       $
681       $ objcopy -O binary --only-section=__versions drivers/scsi/sg.ko versions
682       $
683       $ ls -la versions
684       -rw-rw-r--. 1 acme acme 7616 Jun 25 11:33 versions
685       $
686       $ pahole --count 3 -C modversion_info drivers/scsi/sg.ko --prettify versions
687       {
688             .crc = 0x8dabd84,
689             .name = "module_layout",
690       },
691       {
692             .crc = 0x45e4617b,
693             .name = "no_llseek",
694       },
695       {
696             .crc = 0xa23fae8c,
697             .name = "param_ops_int",
698       },
699       $
700       $ pahole --skip 1 --count 2 -C modversion_info drivers/scsi/sg.ko --prettify - < versions
701       {
702             .crc = 0x45e4617b,
703             .name = "no_llseek",
704       },
705       {
706             .crc = 0xa23fae8c,
707             .name = "param_ops_int",
708       },
709       $

       Skipping one 64-byte record is equivalent to seeking 64 bytes into the
       file, here printing a single record:
711
712       $ pahole --seek_bytes 64 --count 1 -C modversion_info drivers/scsi/sg.ko --prettify versions
713       {
714            .crc = 0x45e4617b,
715            .name = "no_llseek",
716       },
717       $
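
       The relationship between --skip, --count and --seek_bytes for
       fixed-size records can be sketched with Python's struct module. This
       is only an illustration of the record arithmetic, not of how pahole is
       implemented; it assumes a little-endian layout with an 8-byte crc
       followed by name[56], matching the 64-byte 'struct modversion_info'
       shown above, and rebuilds the three records from the example output
       instead of reading a real 'versions' file.

```python
import struct

# crc (8 bytes, little-endian) followed by name[56]: 64 bytes per record.
RECORD = struct.Struct("<Q56s")


def make_record(crc, name):
    """Build one raw record; struct pads the name with NUL bytes."""
    return RECORD.pack(crc, name.encode())


def prettify(data, skip=0, count=None, seek_bytes=0):
    """Decode records after seeking 'seek_bytes' and skipping 'skip' records."""
    start = seek_bytes + skip * RECORD.size
    out = []
    for offset in range(start, len(data) - RECORD.size + 1, RECORD.size):
        if count is not None and len(out) >= count:
            break
        crc, name = RECORD.unpack_from(data, offset)
        out.append({"crc": crc, "name": name.split(b"\0", 1)[0].decode()})
    return out


DATA = b"".join(
    [
        make_record(0x8DABD84, "module_layout"),
        make_record(0x45E4617B, "no_llseek"),
        make_record(0xA23FAE8C, "param_ops_int"),
    ]
)

if __name__ == "__main__":
    # Skipping one record and seeking 64 bytes select the same record.
    print(prettify(DATA, skip=1, count=1))
    print(prettify(DATA, seek_bytes=64, count=1))
```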
718
719       -C, --class_name=CLASS_NAME
720              Pretty print according to this class. Arguments may be passed to
721              it to affect how the pretty printing is performed, e.g.:
722
723
724           -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_EXIT)'
725
       This would select 'struct perf_event_header' as the type to use to
       pretty print records. It states that the 'size' field in that struct
       should be used to figure out the size of the record (variable sized
       records), that 'enum perf_event_type' should be used to pretty print
       the numeric value in perf_event_header->type and, furthermore, that it
       should be used to heuristically look for structs with the same name
       (lowercase) as the enum entry that is converted from the type field,
       using it to pretty print instead of the base 'perf_event_header' type.
       See the PRETTY PRINTING EXAMPLES section below.

       Furthermore the 'filter=' part can be used, so far with only the '=='
       operator, to filter based on the 'type' field, converting the string
       'PERF_RECORD_EXIT' to a number according to type_enum.

       The 'sizeof' arg defaults to the 'size' member name. If the name is
       different, one can use the 'sizeof=sz' form. Ditto for the 'type'
       arg, which defaults to the 'type' member name and can be changed
       with 'type=other_member_name'.
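
       The role of the 'sizeof' and 'filter' arguments can be sketched as a
       walk over variable-sized records, where each record starts with a
       header whose 'size' member gives the total record length. This is an
       illustration only, not pahole's implementation. It assumes a
       little-endian header of type (u32), misc (u16) and size (u16), and the
       numeric values used for PERF_RECORD_EXIT and PERF_RECORD_CGROUP are
       quoted from memory of linux/perf_event.h, so treat them as
       assumptions.

```python
import struct

HEADER = struct.Struct("<IHH")  # type (u32), misc (u16), size (u16)

# Assumed enum perf_event_type values, see the note above.
PERF_RECORD_EXIT = 4
PERF_RECORD_CGROUP = 19


def make_record(record_type, payload):
    """Build a raw record: header plus payload, size covering both."""
    return HEADER.pack(record_type, 0, HEADER.size + len(payload)) + payload


def records(data, type_filter=None):
    """Return (type, size, payload) tuples, optionally only one type."""
    out = []
    offset = 0
    while offset + HEADER.size <= len(data):
        record_type, _misc, size = HEADER.unpack_from(data, offset)
        if size < HEADER.size:
            break  # malformed record, stop instead of looping forever
        if type_filter is None or record_type == type_filter:
            out.append((record_type, size, data[offset + HEADER.size:offset + size]))
        offset += size  # the 'size' member tells where the next record starts
    return out


STREAM = (
    make_record(PERF_RECORD_CGROUP, b"a" * 8)
    + make_record(PERF_RECORD_EXIT, b"b" * 16)
    + make_record(PERF_RECORD_CGROUP, b"c" * 4)
)

if __name__ == "__main__":
    for record in records(STREAM, type_filter=PERF_RECORD_CGROUP):
        print(record)
```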
744
745

PRETTY PRINTING EXAMPLES

       Looking at the ELF header for a vmlinux file, using BTF, first let's
       discover the ELF header type:
749
750       $ pahole --sizes | grep -i elf | grep -i _h
751       elf64_hdr 64   0
752       elf32_hdr 52   0
753       $
754
755       Now we can use this to show the first record from offset zero:
756
757       $ pahole -C elf64_hdr --count 1 --prettify /lib/modules/5.8.0-rc3+/build/vmlinux
758       {
759            .e_ident = { 127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
760            .e_type = 2,
761            .e_machine = 62,
762            .e_version = 1,
763            .e_entry = 16777216,
764            .e_phoff = 64,
765            .e_shoff = 775923840,
766            .e_flags = 0,
767            .e_ehsize = 64,
768            .e_phentsize = 56,
769            .e_phnum = 5,
770            .e_shentsize = 64,
771            .e_shnum = 80,
772            .e_shstrndx = 79,
773       },
774       $
775
776       This is equivalent to:
777
778       $ pahole --header elf64_hdr --prettify /lib/modules/5.8.0-rc3+/build/vmlinux
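
       To relate the numbers printed above to the on-disk format, the sketch
       below packs and unpacks a 64-byte ELF64 header with Python's struct
       module, using the field values from the example output. The field
       order follows the standard Elf64_Ehdr layout and little-endian
       encoding is assumed; this only illustrates the decoding and is not
       how pahole itself reads the data.

```python
import struct

# Elf64_Ehdr: e_ident[16], then the fixed-width fields in order.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_ehdr(raw):
    """Decode the first 64 bytes of 'raw' into a dict keyed by field name."""
    return dict(zip(FIELDS, EHDR.unpack_from(raw)))


# Rebuild the raw header bytes from the values in the example output.
E_IDENT = bytes([127, 69, 76, 70, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
RAW = EHDR.pack(E_IDENT, 2, 62, 1, 16777216, 64, 775923840, 0,
                64, 56, 5, 64, 80, 79)

if __name__ == "__main__":
    header = parse_ehdr(RAW)
    # The first four e_ident bytes are the ELF magic, 0x7f 'E' 'L' 'F'.
    print(header["e_ident"][:4], header["e_machine"], header["e_shnum"])
```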
779
       The --header option also allows other command line options to refer to
       fields in the header. This is useful when one wants to show multiple
       records in a file and the range where those records are located is
       specified in header fields, such as for perf.data files:
784
785       $ pahole --hex ~/bin/perf --header perf_file_header --prettify perf.data
786       {
787            .magic = 0x32454c4946524550,
788            .size = 0x68,
789            .attr_size = 0x88,
790            .attrs = {
791                 .offset = 0xa8,
792                 .size = 0x88,
793            },
794            .data = {
795                 .offset = 0x130,
796                 .size = 0x588,
797            },
798            .event_types = {
799                 .offset = 0,
800                 .size = 0,
801            },
802            .adds_features = { 0x16717ffc, 0, 0, 0 },
803       },
804       $
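
       Two of the values printed above can be cross-checked by hand. The
       sketch below decodes the '.magic' value as little-endian ASCII and
       computes the byte range covered by the '.data' member from its offset
       and size. The helper names are made up for this illustration.

```python
def magic_text(magic):
    """Interpret a 64-bit magic number as 8 little-endian ASCII bytes."""
    return magic.to_bytes(8, "little").decode("ascii")


def section_span(offset, size):
    """Return the (start, end) byte range of a section, end exclusive."""
    return offset, offset + size


if __name__ == "__main__":
    print(magic_text(0x32454C4946524550))  # PERFILE2
    start, end = section_span(0x130, 0x588)
    print(hex(start), hex(end))  # 0x130 0x6b8
```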
805
806       So to display the cgroups records in the perf_file_header.data  section
807       we can use:
808
809       $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_CGROUP)' --prettify perf.data
810       {
811            .header = {
812                 .type = PERF_RECORD_CGROUP,
813                 .misc = 0,
814                 .size = 40,
815            },
816            .id = 1,
817            .path = "/",
818       },
819       {
820            .header = {
821                 .type = PERF_RECORD_CGROUP,
822                 .misc = 0,
823                 .size = 48,
824            },
825            .id = 1553,
826            .path = "/system.slice",
827       },
828       {
829            .header = {
830                 .type = PERF_RECORD_CGROUP,
831                 .misc = 0,
832                 .size = 48,
833            },
834            .id = 8,
835            .path = "/machine.slice",
836       },
837       {
838            .header = {
839                 .type = PERF_RECORD_CGROUP,
840                 .misc = 0,
841                 .size = 128,
842            },
843            .id = 7828,
844            .path = "/machine.slice/libpod-42be8e8d4eb9d22405845005f0d04ea398548dccc934a150fbaa3c1f1f9492c2.scope",
845       },
846       {
847            .header = {
848                 .type = PERF_RECORD_CGROUP,
849                 .misc = 0,
850                 .size = 88,
851            },
852            .id = 13,
853            .path = "/machine.slice/machine-qemu\x2d1\x2drhel6.sandy.scope",
854       },
855       $
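
       What --seek_bytes, --size_bytes, 'sizeof' and the filter do together
       can be pictured as a loop over variable-sized records.  The Python
       sketch below is an illustration, not pahole's implementation.  It
       assumes little-endian data, an 8-byte perf_event_header (__u32 type,
       __u16 misc, __u16 size), and that 'sizeof' without a value means the
       record length is read from the 'size' member.  The names walk_records
       and make_record are hypothetical, and the input is synthetic.

```python
import struct

EVENT_HEADER = struct.Struct("<IHH")  # type, misc, size (8 bytes)
PERF_RECORD_CGROUP = 19


def walk_records(buf, seek_bytes, size_bytes, wanted_type=None):
    """Yield (type, misc, payload) for each record in the given byte range."""
    pos, end = seek_bytes, seek_bytes + size_bytes
    while pos + EVENT_HEADER.size <= end:
        rtype, misc, size = EVENT_HEADER.unpack_from(buf, pos)
        if size < EVENT_HEADER.size or pos + size > end:
            break  # malformed or truncated record: stop instead of looping
        if wanted_type is None or rtype == wanted_type:
            yield rtype, misc, buf[pos + EVENT_HEADER.size:pos + size]
        pos += size  # the record's own size member gives the next offset


def make_record(rtype, payload):
    """Build a synthetic record whose size member covers header and payload."""
    return EVENT_HEADER.pack(rtype, 0, EVENT_HEADER.size + len(payload)) + payload


# Synthetic data: 16 bytes standing in for a file header, then three records.
cgroup = struct.pack("<Q", 1) + b"/".ljust(24, b"\0")
data = make_record(3, b"x" * 8) + make_record(PERF_RECORD_CGROUP, cgroup) + make_record(9, b"")
buf = b"\0" * 16 + data

for rtype, misc, payload in walk_records(buf, 16, len(data), PERF_RECORD_CGROUP):
    print(rtype, misc, EVENT_HEADER.size + len(payload))
```

       Only the cgroup record passes the filter.  Its total length of 40
       bytes (8 for the header, 8 for the id, 24 for the padded path) matches
       the first record in the output above.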
856
857       For the common case of the header having a member that has the 'offset'
858       and 'size' members, it is possible to use this more compact form:
859
860       $ pahole ~/bin/perf --header=perf_file_header --range=data -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_CGROUP)' --prettify perf.data
861
       This uses ~/bin/perf to get the type definitions, then defines 'struct
       perf_file_header' as the header, then seeks '$header.data.offset' bytes
       from the start of the file, and considers '$header.data.size' bytes
       worth of such records.  The filter expression may omit a common prefix:
       in this case it could equivalently be written as 'filter=type==CGROUP'.
       The 'filter=' can also be omitted, making the expression as compact as
       'type==CGROUP'.
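
       The shorthand forms of the filter can be pictured as a small
       normalization step.  The Python sketch below is an assumption about
       the behavior described above, not pahole's parser: it handles only
       'member==VALUE' filters, computes the common prefix of the
       enumerators with os.path.commonprefix, and uses a subset of 'enum
       perf_event_type'.  The names parse_filter and resolve_enumerator are
       hypothetical.

```python
import os

# Subset of 'enum perf_event_type', enough to illustrate the lookup.
PERF_EVENT_TYPE = {
    "PERF_RECORD_MMAP": 1,
    "PERF_RECORD_COMM": 3,
    "PERF_RECORD_SAMPLE": 9,
    "PERF_RECORD_CGROUP": 19,
}


def resolve_enumerator(name, enumerators):
    """Map a full name, a prefix-less name or a number to the enum value."""
    if name in enumerators:
        return enumerators[name]
    prefix = os.path.commonprefix(list(enumerators))
    if prefix + name in enumerators:
        return enumerators[prefix + name]
    return int(name, 0)  # fall back to a literal such as 19 or 0x13


def parse_filter(arg, enumerators):
    """Accept 'filter=type==X' or the bare 'type==X' and return (member, value)."""
    expr = arg[len("filter="):] if arg.startswith("filter=") else arg
    member, _, value = expr.partition("==")
    return member, resolve_enumerator(value, enumerators)


for arg in ("filter=type==PERF_RECORD_CGROUP", "filter=type==CGROUP",
            "type==CGROUP", "type==19"):
    print(arg, "->", parse_filter(arg, PERF_EVENT_TYPE))
```

       All four spellings resolve to the same member and value, ('type', 19).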
869
870       If we look at:
871
872       $ pahole ~/bin/perf -C perf_event_header
873       struct perf_event_header {
874            __u32                      type;                 /*     0     4 */
875            __u16                      misc;                 /*     4     2 */
876            __u16                      size;                 /*     6     2 */
877
878            /* size: 8, cachelines: 1, members: 3 */
879            /* last cacheline: 8 bytes */
880       };
881       $
882
883       And:
884
885       $ pahole ~/bin/perf -C perf_event_type
886       enum perf_event_type {
887            PERF_RECORD_MMAP = 1,
888            PERF_RECORD_LOST = 2,
889            PERF_RECORD_COMM = 3,
890            PERF_RECORD_EXIT = 4,
891            PERF_RECORD_THROTTLE = 5,
892            PERF_RECORD_UNTHROTTLE = 6,
893            PERF_RECORD_FORK = 7,
894            PERF_RECORD_READ = 8,
895            PERF_RECORD_SAMPLE = 9,
896            PERF_RECORD_MMAP2 = 10,
897            PERF_RECORD_AUX = 11,
898            PERF_RECORD_ITRACE_START = 12,
899            PERF_RECORD_LOST_SAMPLES = 13,
900            PERF_RECORD_SWITCH = 14,
901            PERF_RECORD_SWITCH_CPU_WIDE = 15,
902            PERF_RECORD_NAMESPACES = 16,
903            PERF_RECORD_KSYMBOL = 17,
904            PERF_RECORD_BPF_EVENT = 18,
905            PERF_RECORD_CGROUP = 19,
906            PERF_RECORD_TEXT_POKE = 20,
907            PERF_RECORD_MAX = 21,
908       };
909       $
910
911       And furthermore:
912
913       $ pahole ~/bin/perf -C perf_record_cgroup
914       struct perf_record_cgroup {
915            struct perf_event_header   header;               /*     0     8 */
916            __u64                      id;                   /*     8     8 */
917            char                       path[4096];           /*    16  4096 */
918
919            /* size: 4112, cachelines: 65, members: 3 */
920            /* last cacheline: 16 bytes */
921       };
922       $
923
       Then we can see how the perf_event_header.type could be converted from
       a __u32 to a string (PERF_RECORD_CGROUP).  If we remove
       type_enum=perf_event_type, we will lose the conversion of 'struct
       perf_event_header' to the more descriptive 'struct perf_record_cgroup',
       and also the beautification of the header.type field:
929
930       $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,filter=type==19)' --prettify perf.data
931       {
932            .type = 19,
933            .misc = 0,
934            .size = 40,
935       },
936       {
937            .type = 19,
938            .misc = 0,
939            .size = 48,
940       },
941       {
942            .type = 19,
943            .misc = 0,
944            .size = 48,
945       },
946       {
947            .type = 19,
948            .misc = 0,
949            .size = 128,
950       },
951       {
952            .type = 19,
953            .misc = 0,
954            .size = 88,
955       },
956       $
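
       The conversion that type_enum enables can be sketched as follows.
       This Python illustration assumes that the struct is found by
       lowercasing the enumerator name (PERF_RECORD_CGROUP becomes
       perf_record_cgroup), which is what the names in this example suggest.
       It also assumes little-endian data and decodes only the cgroup record.
       The names struct_name_for and decode_cgroup are hypothetical.

```python
import struct

PERF_EVENT_TYPE = {19: "PERF_RECORD_CGROUP"}  # subset of 'enum perf_event_type'
EVENT_HEADER = struct.Struct("<IHH")          # type, misc, size


def struct_name_for(rtype, enumerators):
    """Return the candidate struct name for a record type, or None."""
    name = enumerators.get(rtype)
    return name.lower() if name else None


def decode_cgroup(record):
    """Decode 'struct perf_record_cgroup': header, __u64 id, NUL-terminated path."""
    rtype, misc, size = EVENT_HEADER.unpack_from(record, 0)
    (cgroup_id,) = struct.unpack_from("<Q", record, 8)
    path = record[16:size].split(b"\0", 1)[0].decode()
    return {"header": {"type": PERF_EVENT_TYPE[rtype], "misc": misc, "size": size},
            "id": cgroup_id, "path": path}


# Synthetic 48-byte record modeled on the '/system.slice' record shown earlier.
record = (EVENT_HEADER.pack(19, 0, 48) + struct.pack("<Q", 1553)
          + b"/system.slice".ljust(32, b"\0"))
print(struct_name_for(19, PERF_EVENT_TYPE))
print(decode_cgroup(record))
print(struct_name_for(79, PERF_EVENT_TYPE))
```

       A type that is missing from the enumeration, such as 79 here, yields
       no struct name, so only the generic header could be shown for it.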
957
       Some of the records are not found in 'type_enum=perf_event_type', so
       they don't get converted to a type that fully shows their contents.
       For perf we know that those are in another enumeration, 'enum
       perf_user_event_type', so, for these cases, we can create a 'virtual
       enum', i.e. the sum of two enums, and then get all those entries
       decoded and properly cast.  First, a few records with just 'enum
       perf_event_type':
965
966       $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type)' --count 4 --prettify perf.data
967       {
968            .type = 79,
969            .misc = 0,
970            .size = 32,
971       },
972       {
973            .type = 73,
974            .misc = 0,
975            .size = 40,
976       },
977       {
978            .type = 74,
979            .misc = 0,
980            .size = 32,
981       },
982       {
983            .header = {
984                 .type = PERF_RECORD_CGROUP,
985                 .misc = 0,
986                 .size = 40,
987            },
988            .id = 1,
989            .path = "/",
990       },
991       $
992
       Now with both enumerations, i.e. with
       'type_enum=perf_event_type+perf_user_event_type':
995
996       $ pahole ~/bin/perf --header=perf_file_header --seek_bytes '$header.data.offset' --size_bytes='$header.data.size' -C 'perf_event_header(sizeof,type,type_enum=perf_event_type+perf_user_event_type)' --count 5 --prettify perf.data
997       {
998            .header = {
999                 .type = PERF_RECORD_TIME_CONV,
1000                 .misc = 0,
1001                 .size = 32,
1002            },
1003            .time_shift = 31,
1004            .time_mult = 1016803377,
1005            .time_zero = 435759009518382,
1006       },
1007       {
1008            .header = {
1009                 .type = PERF_RECORD_THREAD_MAP,
1010                 .misc = 0,
1011                 .size = 40,
1012            },
1013            .nr = 1,
1014            .entries = 0x50 0x7e 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00,
1015       },
1016       {
1017            .header = {
1018                 .type = PERF_RECORD_CPU_MAP,
1019                 .misc = 0,
1020                 .size = 32,
1021            },
1022            .data = {
1023                 .type = 1,
1024                 .data = "",
1025            },
1026       },
1027       {
1028            .header = {
1029                 .type = PERF_RECORD_CGROUP,
1030                 .misc = 0,
1031                 .size = 40,
1032            },
1033            .id = 1,
1034            .path = "/",
1035       },
1036       {
1037            .header = {
1038                 .type = PERF_RECORD_CGROUP,
1039                 .misc = 0,
1040                 .size = 48,
1041            },
1042            .id = 1553,
1043            .path = "/system.slice",
1044       },
1045       $
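
       The 'virtual enum' can be thought of as the union of the two
       enumerator tables.  A minimal Python sketch follows, listing only the
       entries that can be read off the two outputs above (79, 73 and 74 are
       the raw values printed before perf_user_event_type was added).  The
       function name combine_enums is hypothetical, and the conflict check
       is an assumption, not documented pahole behavior.

```python
def combine_enums(*enums):
    """Merge several value->name tables, refusing conflicting values."""
    combined = {}
    for enum in enums:
        for value, name in enum.items():
            if combined.get(value, name) != name:
                raise ValueError(f"value {value} is both {combined[value]} and {name}")
            combined[value] = name
    return combined


perf_event_type = {19: "PERF_RECORD_CGROUP"}
perf_user_event_type = {
    73: "PERF_RECORD_THREAD_MAP",
    74: "PERF_RECORD_CPU_MAP",
    79: "PERF_RECORD_TIME_CONV",
}

virtual_enum = combine_enums(perf_event_type, perf_user_event_type)
for raw_type in (79, 73, 74, 19):
    # Fall back to the raw number when no enumerator is known.
    print(raw_type, virtual_enum.get(raw_type, raw_type))
```

       With only perf_event_type in the table, the first three lookups would
       fall back to the raw numbers, as in the earlier output.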
1046
       It is possible to pass multiple types; one has only to make sure they
       appear in the file in sequence.  For the perf.data example (see the
       perf_file_header dump above), one can print the perf_file_attr structs
       in the header attrs range, then the perf_event_header records in the
       data range, with the following command:
1052
       $ pahole ~/bin/perf --header=perf_file_header -C 'perf_file_attr(range=attrs),perf_event_header(range=data,sizeof,type,type_enum=perf_event_type+perf_user_event_type)' --prettify perf.data
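
       The requirement that the types appear in sequence can be checked
       against the header values.  The Python sketch below resolves each
       'range=NAME' to the offset and size of the header member NAME and
       verifies that the ranges do not go backwards in the file.  The header
       values are copied from the perf_file_header dump above.  The function
       name resolve_ranges is hypothetical, and the ordering check is an
       illustration rather than something pahole is documented to enforce.

```python
# Values copied from the perf_file_header dump shown earlier.
header = {
    "attrs": {"offset": 0xA8, "size": 0x88},
    "data": {"offset": 0x130, "size": 0x588},
}


def resolve_ranges(header, requests):
    """Turn [(type_name, range_name)] into [(type_name, start, end)] in file order."""
    resolved, previous_end = [], 0
    for type_name, range_name in requests:
        member = header[range_name]
        start, end = member["offset"], member["offset"] + member["size"]
        if start < previous_end:
            raise ValueError(
                f"{type_name}: range '{range_name}' starts before the previous one ends")
        resolved.append((type_name, start, end))
        previous_end = end
    return resolved


for name, start, end in resolve_ranges(
        header, [("perf_file_attr", "attrs"), ("perf_event_header", "data")]):
    print(f"{name}: bytes {start:#x}..{end:#x}")
```

       In this file the attrs range ends at 0x130, exactly where the data
       range begins, so the two types can be printed in a single pass.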
1054
1055

SEE ALSO

1057       eu-readelf(1), readelf(1), objdump(1).
1058
1059       https://www.kernel.org/doc/ols/2007/ols2007v2-pages-35-44.pdf.
1060

AUTHOR

1062       pahole  was  written  and  is  maintained  by  Arnaldo Carvalho de Melo
1063       <acme@kernel.org>.
1064
       Thanks to Andrii Nakryiko and Martin KaFai Lau for providing the BTF
       encoder and improving the codebase while making sure the BTF encoder
       works as needed to be used in encoding the Linux kernel .BTF section
       from the DWARF info generated by gcc.  For that Andrii wrote a BTF
       deduplicator in libbpf that is used by pahole.
1070
1071       Also thanks to Conectiva, Mandriva and Red Hat for allowing me to  work
1072       on these tools.
1073
1074       Please send bug reports to <dwarves@vger.kernel.org>.
1075
1076       No subscription is required.
1077
1078
1079
1080dwarves                        January 16, 2020                      pahole(1)