PERF-INJECT(1) perf Manual PERF-INJECT(1)

NAME
perf-inject - Filter to augment the events stream with additional
information

SYNOPSIS
perf inject <options>

DESCRIPTION
perf-inject reads a perf-record event stream and writes it back out, to
stdout by default. While processing the stream it can inject additional
events; for example, with the -b option build-ids are read and injected
into the event stream as needed.

Build-ids were just the first user of perf-inject; anything that needs
userspace processing to augment the event stream with additional
information can make use of this facility.

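As a rough illustration of the repiping described above, the following
sketch wraps a record, inject and report pipeline in a shell function.
The function name record_inject_report is hypothetical, and it assumes
that perf is installed and that pipe mode (-o - and -i -) works in your
build.

```shell
#!/bin/sh
# Sketch only: record to stdout, inject build-ids on the fly, and
# report from stdin. record_inject_report is a hypothetical helper name.
record_inject_report() {
    # "$@" is the workload command line to profile.
    perf record -o - "$@" | perf inject -b | perf report -i - --stdio
}
```

It could be called as record_inject_report ./workload, where ./workload
stands for whatever command you want to profile.
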
OPTIONS
-b, --build-ids
Inject build-ids of DSOs hit by samples into the output stream.
This requires processing all SAMPLE records to find the DSOs.

--buildid-all
Inject build-ids of all DSOs into the output stream regardless of
hits, skipping SAMPLE processing.

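The two build-id modes above can be compared with a small sketch. Both
helper names are hypothetical, the file names are supplied by the
caller, and it is an assumption here that --buildid-all can be given on
its own without -b.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.
# $1 = input perf.data file, $2 = output file.

# Inject build-ids only for DSOs hit by samples (reads SAMPLE records).
inject_hit_build_ids() {
    perf inject -b -i "$1" -o "$2"
}

# Inject build-ids for every DSO in the file, skipping SAMPLE processing.
inject_all_build_ids() {
    perf inject --buildid-all -i "$1" -o "$2"
}
```

For example, inject_hit_build_ids perf.data perf.data.bid writes a copy
of perf.data with build-ids added for the sampled DSOs.
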
--known-build-ids=
Override build-ids to inject using these comma-separated pairs of
build-id and path. Understands file://filename to read these pairs
from a file, which can be generated with perf buildid-list.

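The file:// form might be used roughly as follows. This is a sketch
under assumptions: the helper name inject_known_build_ids and all file
names are hypothetical, the list is taken from a separate perf.data file
whose build-ids are trusted, and the override is combined with -b.

```shell
#!/bin/sh
# Sketch only: inject_known_build_ids is a hypothetical helper name.
# $1 = perf.data file whose build-ids are trusted (the reference)
# $2 = perf.data file to rewrite
# $3 = output file
# $4 = scratch file that receives the build-id/path list
inject_known_build_ids() {
    # Write the reference build-id/path pairs to the scratch file.
    perf buildid-list -i "$1" > "$4" || return 1
    # Inject build-ids, overriding them with the pairs read from the file.
    perf inject -b --known-build-ids="file://$4" -i "$2" -o "$3"
}
```
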
-v, --verbose
Be more verbose.

-i, --input=
Input file name. (default: stdin)

-o, --output=
Output file name. (default: stdout)

-s, --sched-stat
Merge sched_stat and sched_switch events to show where and for how
long tasks slept. sched_switch provides the callchain at the point
where a task went to sleep, and sched_stat provides the length of
time the task slept.

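A sleep-profiling session built on this option could look like the
sketch below. The helper name profile_sleep is hypothetical, and it
assumes that the sched:sched_stat_sleep and sched:sched_switch
tracepoints are available on the running kernel and that you have
permission to record them.

```shell
#!/bin/sh
# Sketch only: profile_sleep is a hypothetical helper name.
# $1 = raw output of perf record, $2 = merged output of perf inject,
# remaining arguments = the workload command line.
profile_sleep() {
    raw=$1; merged=$2; shift 2
    # Record both event types, with callchains for sched_switch.
    perf record -e sched:sched_stat_sleep -e sched:sched_switch -g \
        -o "$raw" "$@" || return 1
    # Merge the two event types into sleep events with callchains.
    perf inject -s -i "$raw" -o "$merged" || return 1
    perf report --stdio --show-total-period -i "$merged"
}
```
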
-k, --vmlinux=<file>
vmlinux pathname

--ignore-vmlinux
Ignore vmlinux files.

--kallsyms=<file>
kallsyms pathname

--itrace
Decode Instruction Tracing data, replacing it with synthesized
events. Options are:

i   synthesize instructions events
b   synthesize branches events (branch misses for Arm SPE)
c   synthesize branches events (calls only)
r   synthesize branches events (returns only)
x   synthesize transactions events
w   synthesize ptwrite events
p   synthesize power events (incl. PSB events for Intel PT)
o   synthesize other events recorded due to the use
    of aux-output (refer to perf record)
I   synthesize interrupt or similar (asynchronous) events
    (e.g. Intel PT Event Trace)
e   synthesize error events
d   create a debug log
f   synthesize first level cache events
m   synthesize last level cache events
M   synthesize memory events
t   synthesize TLB events
a   synthesize remote access events
g   synthesize a call chain (use with i or x)
G   synthesize a call chain on existing event records
l   synthesize last branch entries (use with i or x)
L   synthesize last branch entries on existing event records
s   skip initial number of events
q   quicker (less detailed) decoding
A   approximate IPC
Z   prefer to ignore timestamps (so-called "timeless" decoding)

The default is all events, i.e. the same as --itrace=ibxwpe,
except for perf script where it is --itrace=ce.

In addition, the period (default 100000, except for perf script where it is 1)
for instructions events can be specified in units of:

i   instructions
t   ticks
ms  milliseconds
us  microseconds
ns  nanoseconds (default)

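To make the period syntax concrete, the sketch below assembles an
--itrace option from a period and one of the units listed above. The
helper name itrace_period_opt is hypothetical; it only builds the
option string and does not run perf.

```shell
#!/bin/sh
# Sketch only: itrace_period_opt is a hypothetical helper name.
# Prints an --itrace option that synthesizes instructions events with
# the given period. $1 = period (a non-negative integer), $2 = unit.
itrace_period_opt() {
    case $1 in
        ''|*[!0-9]*)
            echo "period must be a non-negative integer" >&2
            return 1 ;;
    esac
    case $2 in
        i|t|ms|us|ns)
            printf '%s\n' "--itrace=i$1$2" ;;
        *)
            echo "unit must be one of: i t ms us ns" >&2
            return 1 ;;
    esac
}
```

For instance, itrace_period_opt 100 us prints --itrace=i100us, which
requests an instructions event roughly every 100 microseconds.
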
Also the call chain size (default 16, max. 1024) for instructions or
transactions events can be specified.

Also the number of last branch entries (default 64, max. 1024) for
instructions or transactions events can be specified.

Similar to options g and l, size may also be specified for options G and L.
On x86, note that G and L work poorly when data has been recorded with
large PEBS. Refer to the perf-intel-pt(1) man page for details.

It is also possible to skip a number of the generated events (instructions,
branches, transactions, ptwrite, power) at the beginning. This is useful for
ignoring initialization code. For example:

--itrace=i0nss1000000

skips the first million instructions.

The 'e' option may be followed by flags which affect what errors will or
will not be reported. Each flag must be preceded by either '+' or '-'.
The flags are:
o   overflow
l   trace data lost

If supported, the 'd' option may be followed by flags which affect what
debug messages will or will not be logged. Each flag must be preceded
by either '+' or '-'. The flags are:
a   all perf events
e   output only on errors (size configurable - see perf-config(1))
o   output to stdout

If supported, the 'q' option may be repeated to increase the effect.

--strip
Use with --itrace to strip out non-synthesized events.

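Combining --itrace with --strip might look like the sketch below. The
helper name synthesize_and_strip is hypothetical, and the input is
assumed to contain Instruction Tracing data (for example, recorded with
Intel PT).

```shell
#!/bin/sh
# Sketch only: synthesize_and_strip is a hypothetical helper name.
# $1 = input file with Instruction Tracing data, $2 = output file,
# $3 = optional itrace options (for example "i100us"); when omitted,
# --itrace is passed bare so perf uses its default event set.
synthesize_and_strip() {
    if [ -n "${3:-}" ]; then
        perf inject --itrace="$3" --strip -i "$1" -o "$2"
    else
        perf inject --itrace --strip -i "$1" -o "$2"
    fi
}
```

The output file then holds only the synthesized events, which can make
it considerably smaller than the input.
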
-j, --jit
Process jitdump files by injecting the mmap records corresponding
to jitted functions. This option also generates the ELF images for
each jitted function found in the jitdump files captured in the
input perf.data file. Use this option if you are monitoring an
environment that uses JIT runtimes, such as Java, DART or V8.

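A minimal sketch of the injection step follows. The helper name
inject_jit is hypothetical. It assumes the runtime was configured to
write jitdump files while perf record was running, and that the
recording used a clock matching the jitdump timestamps (commonly
selected with perf record -k mono).

```shell
#!/bin/sh
# Sketch only: inject_jit is a hypothetical helper name.
# $1 = perf.data recorded while a JIT runtime was writing jitdump files,
# $2 = output file with mmap records for the jitted functions.
inject_jit() {
    perf inject --jit -i "$1" -o "$2"
}
```

The resulting file can then be passed to perf report with -i so that
samples in jitted code resolve to function names.
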
-f, --force
Don’t complain, do it.

--vm-time-correlation[=OPTIONS]
Some architectures may capture AUX area data which contains
timestamps affected by virtualization. This option will update
those timestamps in place, to correlate with host timestamps. The
in-place update means that an output file is not specified, and
instead the input file is modified. The options are architecture
specific, except that they may start with "dry-run" which will
cause the file to be processed but without updating it. Currently
this option is supported only by Intel PT; refer to perf-intel-pt(1).

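Because the update is done in place, a dry run is a cautious first
step. The sketch below shows only that step; the helper name
check_vm_time_correlation is hypothetical and the input is assumed to
hold Intel PT data.

```shell
#!/bin/sh
# Sketch only: check_vm_time_correlation is a hypothetical helper name.
# $1 = perf.data file containing AUX area data recorded with Intel PT.
# "dry-run" processes the file without modifying it.
check_vm_time_correlation() {
    perf inject -i "$1" --vm-time-correlation=dry-run
}
```

Note that no -o option is given, since this mode never writes a
separate output file.
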
--guest-data=<path>,<pid>[,<time offset>[,<time scale>]]
Insert events from a perf.data file recorded in a virtual machine
at the same time as the input perf.data file was recorded on the
host. The Process ID (PID) of the QEMU hypervisor process must be
provided, and the time offset and time scale (multiplier) will
likely be needed to convert guest time stamps into host time
stamps. For example, for x86 the TSC Offset and Multiplier could be
provided for a virtual machine using Linux command line option
no-kvmclock. Currently only mmap, mmap2, comm, task,
context_switch, ksymbol, and text_poke events are inserted, as well
as build ID information. The QEMU option -name debug-threads=on is
needed so that thread names can be used to determine which thread
is running which VCPU. Note libvirt seems to use this by default.
When using perf record in the guest, option --sample-identifier
should be used, and also --buildid-all and --switch-events may be
useful.

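The argument format above can be assembled with a small helper, sketched
here. The name guest_data_opt is hypothetical, the helper only builds
the option string, and any offset or scale values passed to it are
placeholders that must come from your own virtual machine.

```shell
#!/bin/sh
# Sketch only: guest_data_opt is a hypothetical helper name.
# Prints a --guest-data option from its parts.
# $1 = guest perf.data path, $2 = QEMU PID,
# $3 = optional time offset, $4 = optional time scale.
guest_data_opt() {
    [ "$#" -ge 2 ] || {
        echo "need at least <path> and <pid>" >&2; return 1; }
    # A time scale is only valid when a time offset is also given.
    [ -z "${4:-}" ] || [ -n "${3:-}" ] || {
        echo "a time scale requires a time offset" >&2; return 1; }
    opt="--guest-data=$1,$2"
    [ -z "${3:-}" ] || opt="$opt,$3"
    [ -z "${4:-}" ] || opt="$opt,$4"
    printf '%s\n' "$opt"
}
```

The printed option can be passed to perf inject together with -i for
the host perf.data file and -o for the merged output.
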
--guestmount=<path>
Guest OS root file system mount directory. Users mount guest OS
root directories under <path> by a specific filesystem access
method, typically sshfs. For example, with two guest OSes running,
one with pid 8888 and the other with pid 9999:

$ mkdir ~/guestmount
$ cd ~/guestmount
$ sshfs -o allow_other,direct_io -p 5551 localhost:/ 8888/
$ sshfs -o allow_other,direct_io -p 5552 localhost:/ 9999/
$ perf inject --guestmount=~/guestmount

SEE ALSO
perf-record(1), perf-report(1), perf-archive(1), perf-intel-pt(1)

perf 01/12/2023 PERF-INJECT(1)