PERF-INJECT(1)                    perf Manual                    PERF-INJECT(1)

NAME
perf-inject - Filter to augment the event stream with additional information

SYNOPSIS
perf inject <options>

DESCRIPTION
perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the event
stream; in this case, build-ids (-b option) are read and injected into
the event stream as needed.

Build-ids are just the first user of perf-inject. Potentially anything
that needs userspace processing to augment the event stream with
additional information could make use of this facility.
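
The filter usage described above can be sketched as a single pipeline: record in pipe mode, inject build-ids, and report from stdin. The function name inject_pipeline and the PERF variable are hypothetical conveniences, not part of perf; setting PERF=echo prints the last stage instead of running anything.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# inject_pipeline CMD [ARGS...]: record CMD in pipe mode, inject build-ids
# into the stream, and report from stdin.
inject_pipeline() {
    "$PERF" record -o - -- "$@" |
        "$PERF" inject -b |
        "$PERF" report -i - --stdio
}
```

For example, "inject_pipeline sleep 1". The sketch assumes a workload that writes nothing to stdout, since stdout carries the event stream here.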

OPTIONS
-b, --build-ids
    Inject build-ids of DSOs hit by samples into the output stream.
    This requires processing all SAMPLE records to find the DSOs.

--buildid-all
    Inject build-ids of all DSOs into the output stream regardless of
    hits, and skip SAMPLE processing.

--known-build-ids=
    Override the build-ids to inject using these comma-separated pairs
    of build-id and path. Understands file://filename to read these
    pairs from a file, which can be generated with perf buildid-list.
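
A minimal sketch of the file:// form, assuming the override is combined with -b and that the pairs were saved beforehand with perf buildid-list. The function name, the PERF variable, and all file names are hypothetical.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# inject_known_ids INPUT OUTPUT IDFILE: inject build-ids, overriding them with
# the build-id/path pairs listed in IDFILE.
inject_known_ids() {
    "$PERF" inject -b -i "$1" -o "$2" --known-build-ids="file://$3"
}
```

A possible sequence is "perf buildid-list -i perf.data > buildids.txt" followed by "inject_known_ids perf.data perf.data.ids buildids.txt".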

-v, --verbose
    Be more verbose.

-i, --input=
    Input file name. (default: stdin)

-o, --output=
    Output file name. (default: stdout)

-s, --sched-stat
    Merge sched_stat and sched_switch events to show where and for how
    long tasks slept. sched_switch contains a callchain showing where a
    task slept, and sched_stat contains a timeslice showing how long it
    slept.
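
The merge can be sketched as a record, inject, report sequence. The function name, the PERF variable, and the file names are hypothetical. The tracepoints shown are an assumption about the kernel in use; they usually need elevated privileges and scheduler statistics to be available.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# sleep_profile RAW MERGED CMD [ARGS...]: record scheduler events for CMD into
# RAW, merge sched_stat and sched_switch into MERGED, then report on MERGED.
sleep_profile() {
    raw=$1; merged=$2; shift 2
    "$PERF" record -e sched:sched_stat_sleep -e sched:sched_switch -g \
        -o "$raw" -- "$@" &&
    "$PERF" inject -v -s -i "$raw" -o "$merged" &&
    "$PERF" report --stdio --show-total-period -i "$merged"
}
```

For example, "sleep_profile perf.data.raw perf.data ./foo", where ./foo stands for the program under study.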

-k, --vmlinux=<file>
    vmlinux pathname

--ignore-vmlinux
    Ignore vmlinux files.

--kallsyms=<file>
    kallsyms pathname

--itrace
    Decode Instruction Tracing data, replacing it with synthesized
    events. Options are:

        i   synthesize instructions events
        y   synthesize cycles events
        b   synthesize branches events (branch misses for Arm SPE)
        c   synthesize branches events (calls only)
        r   synthesize branches events (returns only)
        x   synthesize transactions events
        w   synthesize ptwrite events
        p   synthesize power events (incl. PSB events for Intel PT)
        o   synthesize other events recorded due to the use
            of aux-output (refer to perf record)
        I   synthesize interrupt or similar (asynchronous) events
            (e.g. Intel PT Event Trace)
        e   synthesize error events
        d   create a debug log
        f   synthesize first level cache events
        m   synthesize last level cache events
        M   synthesize memory events
        t   synthesize TLB events
        a   synthesize remote access events
        g   synthesize a call chain (use with i or x)
        G   synthesize a call chain on existing event records
        l   synthesize last branch entries (use with i or x)
        L   synthesize last branch entries on existing event records
        s   skip initial number of events
        q   quicker (less detailed) decoding
        A   approximate IPC
        Z   prefer to ignore timestamps (so-called "timeless" decoding)

    The default is all events, i.e. the same as --itrace=iybxwpe,
    except for perf script where it is --itrace=ce.

    In addition, the period (default 100000, except for perf script
    where it is 1) for instructions events can be specified in units of:

        i   instructions
        t   ticks
        ms  milliseconds
        us  microseconds
        ns  nanoseconds (default)

    Also the call chain size (default 16, max. 1024) for instructions or
    transactions events can be specified.

    Also the number of last branch entries (default 64, max. 1024) for
    instructions or transactions events can be specified.

    Similar to options g and l, size may also be specified for options
    G and L. On x86, note that G and L work poorly when data has been
    recorded with large PEBS. Refer to the perf-intel-pt(1) man page for
    details.

    It is also possible to skip events generated (instructions,
    branches, transactions, ptwrite, power) at the beginning. This is
    useful to ignore initialization code.

        --itrace=i0nss1000000

    skips the first million instructions.

    The 'e' option may be followed by flags which affect what errors
    will or will not be reported. Each flag must be preceded by either
    '+' or '-'. The flags are:

        o   overflow
        l   trace data lost

    If supported, the 'd' option may be followed by flags which affect
    what debug messages will or will not be logged. Each flag must be
    preceded by either '+' or '-'. The flags are:

        a   all perf events
        e   output only on errors (size configurable; see perf-config(1))
        o   output to stdout

    If supported, the 'q' option may be repeated to increase the effect.

--strip
    Use with --itrace to strip out non-synthesized events.
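
As an illustration of how the --itrace letters, a period, and --strip combine, the sketch below synthesizes instructions events every 100 microseconds with last branch entries and error events, and drops the non-synthesized events. The function name, the PERF variable, and the file names are hypothetical.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# synth_instructions INPUT OUTPUT: replace trace data with synthesized events.
#   i100us  instructions events with a period of 100 microseconds
#   l       last branch entries
#   e       error events
synth_instructions() {
    "$PERF" inject -i "$1" -o "$2" --itrace=i100usle --strip
}
```

For example, "synth_instructions perf.data perf.data.inj". The input is assumed to contain instruction trace data, such as a recording made with Intel PT.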

-j, --jit
    Process jitdump files by injecting the mmap records corresponding
    to jitted functions. This option also generates the ELF images for
    each jitted function found in the jitdump files captured in the
    input perf.data file. Use this option if you are monitoring an
    environment that uses JIT runtimes, such as Java, DART or V8.
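
A minimal sketch of the jitdump step. It assumes the runtime already wrote jitdump files while the input was being recorded, and that the recording used a monotonic clock (for example perf record -k 1), which is an assumption about the setup rather than something stated here. The function name, the PERF variable, and the file names are hypothetical.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# inject_jit INPUT OUTPUT: add mmap records and ELF images for jitted code
# found in the jitdump files referenced by INPUT.
inject_jit() {
    "$PERF" inject --jit -i "$1" -o "$2"
}
```

For example, "inject_jit perf.data perf.data.jitted", after which the output file can be passed to perf report with -i.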

-f, --force
    Don’t complain, do it.

--vm-time-correlation[=OPTIONS]
    Some architectures may capture AUX area data which contains
    timestamps affected by virtualization. This option will update
    those timestamps in place, to correlate with host timestamps. The
    in-place update means that an output file is not specified, and
    instead the input file is modified. The options are architecture
    specific, except that they may start with "dry-run" which will
    cause the file to be processed but without updating it. Currently
    this option is supported only by Intel PT; refer to perf-intel-pt(1).
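
Because the update is made in place, a dry run is a cautious first step. The sketch below passes only the "dry-run" option described above; the function name, the PERF variable, and the file name are hypothetical.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# check_vm_time INPUT: process INPUT without updating it. No -o is given,
# because this option works on the input file itself.
check_vm_time() {
    "$PERF" inject -i "$1" --vm-time-correlation=dry-run
}
```

For example, "check_vm_time perf.data.kvm".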

--guest-data=<path>,<pid>[,<time offset>[,<time scale>]]
    Insert events from a perf.data file recorded in a virtual machine
    at the same time as the input perf.data file was recorded on the
    host. The Process ID (PID) of the QEMU hypervisor process must be
    provided, and the time offset and time scale (multiplier) will
    likely be needed to convert guest time stamps into host time
    stamps. For example, for x86 the TSC Offset and Multiplier could be
    provided for a virtual machine using Linux command line option
    no-kvmclock. Currently only mmap, mmap2, comm, task,
    context_switch, ksymbol, and text_poke events are inserted, as well
    as build ID information. The QEMU option -name debug-threads=on is
    needed so that thread names can be used to determine which thread
    is running which VCPU. Note libvirt seems to use this by default.
    When using perf record in the guest, option --sample-identifier
    should be used, and also --buildid-all and --switch-events may be
    useful.
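
The argument syntax above can be sketched as a small helper that appends the optional time offset and time scale only when they are given. The function name, the PERF variable, the file names, and the numeric values in the usage line are hypothetical.

```shell
PERF=${PERF:-perf}   # hypothetical override; PERF=echo prints instead of running

# inject_guest HOST_DATA OUTPUT GUEST_DATA QEMU_PID [TIME_OFFSET [TIME_SCALE]]
inject_guest() {
    host=$1; out=$2
    arg="$3,$4"                        # <path>,<pid> are mandatory
    if [ -n "${5:-}" ]; then
        arg="$arg,$5"                  # optional <time offset>
        if [ -n "${6:-}" ]; then
            arg="$arg,$6"              # optional <time scale>
        fi
    fi
    "$PERF" inject -i "$host" -o "$out" --guest-data="$arg"
}
```

For example, "inject_guest perf.data.host perf.data perf.data.guest 1234 0x10 1.5", where 1234 stands for the PID of the QEMU process.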

--guestmount=<path>
    Guest OS root file system mount directory. Users mount guest OS
    root directories under <path> by a specific filesystem access
    method, typically sshfs. For example, start two guest OSes, one
    with PID 8888 and the other with PID 9999:

        $ mkdir -p ~/guestmount/8888 ~/guestmount/9999
        $ cd ~/guestmount
        $ sshfs -o allow_other,direct_io -p 5551 localhost:/ 8888/
        $ sshfs -o allow_other,direct_io -p 5552 localhost:/ 9999/
        $ perf inject --guestmount ~/guestmount

SEE ALSO
perf-record(1), perf-report(1), perf-archive(1), perf-intel-pt(1)

perf                              11/28/2023                     PERF-INJECT(1)