perf-inject(1)

PERF-INJECT(1)                    perf Manual                   PERF-INJECT(1)

NAME

       perf-inject - Filter to augment the events stream with additional
       information

SYNOPSIS

       perf inject <options>

DESCRIPTION

       perf-inject reads a perf-record event stream and repipes it to stdout.
       At any point the processing code can inject other events into the event
       stream - in this case build-ids (-b option) are read and injected as
       needed into the event stream.

       Build-ids are just the first user of perf-inject - potentially anything
       that needs userspace processing to augment the events stream with
       additional information could make use of this facility.

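       The pipe usage described above can be sketched as follows. This is a
       minimal sketch, assuming perf is installed; build_id_pipeline is a
       hypothetical helper and 'sleep 1' is a placeholder workload. The helper
       only prints the pipeline so that it can be inspected before being run.

```shell
# Print a record | inject | report pipeline for a given workload.
# 'perf record -o -' writes the event stream to stdout, 'perf inject -b'
# reads it on stdin and adds build-ids, and 'perf report -i -' reads the
# augmented stream from stdin.
build_id_pipeline() {
    printf 'perf record -o - %s | perf inject -b | perf report -i -\n' "$1"
}

build_id_pipeline 'sleep 1'
```

       Because perf inject defaults to stdin and stdout, no -i or -o option
       is needed in the middle of the pipeline.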

OPTIONS

       -b, --build-ids
           Inject build-ids of DSOs hit by samples into the output stream.
           This means it needs to process all SAMPLE records to find the DSOs.

       --buildid-all
           Inject build-ids of all DSOs into the output stream regardless of
           hits and skip SAMPLE processing.

       --known-build-ids=
           Override build-ids to inject using these comma-separated pairs of
           build-id and path. Understands file://filename to read these pairs
           from a file, which can be generated with perf buildid-list.

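           As an illustration of the comma-separated form, the sketch below
           joins one-pair-per-line input into a single option value. It
           assumes each pair is written as "build-id path" separated by a
           space, as in the output of perf buildid-list; that layout, the
           helper name known_build_ids_arg, and the sample build-ids and
           paths are assumptions made for the example.

```shell
# Join "build-id path" lines (one pair per line) into one comma-separated
# value suitable for --known-build-ids=.
known_build_ids_arg() {
    paste -sd, -
}

# Two made-up pairs in the assumed "build-id path" layout.
printf '%s\n' \
    '0123456789abcdef0123456789abcdef01234567 /usr/bin/foo' \
    '89abcdef0123456789abcdef0123456789abcdef /usr/lib/libbar.so' |
    known_build_ids_arg
```

           The joined value contains spaces, so it has to be quoted when
           passed on the command line. Paths that themselves contain commas
           cannot be expressed this way; the file://filename form avoids the
           joining step altogether.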
       -v, --verbose
           Be more verbose.

       -i, --input=
           Input file name. (default: stdin)

       -o, --output=
           Output file name. (default: stdout)

       -s, --sched-stat
           Merge sched_stat and sched_switch events to show where and for how
           long tasks slept. sched_switch contains the callchain at the point
           where a task went to sleep, and sched_stat contains the length of
           time the task slept.

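           A possible record, inject, report sequence for this option is
           sketched below. The tracepoint names and file names are
           assumptions (the sched_stat tracepoints depend on the kernel
           configuration), and run and sleep_profile are hypothetical
           helpers. By default the commands are only printed; set DRY_RUN=0
           to execute them.

```shell
# Echo each command; execute it only when DRY_RUN=0 (default: print only).
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" != 0 ] || "$@"
}

# Record scheduler events with callchains, merge them with -s, then report.
sleep_profile() {
    run perf record -e sched:sched_stat_sleep -e sched:sched_switch \
        -e sched:sched_process_exit -g -o perf.data.raw "$@"
    run perf inject -v -s -i perf.data.raw -o perf.data
    run perf report --stdio --show-total-period -i perf.data
}

sleep_profile ./foo
```

           The callchains recorded with -g are what allow the merged events
           to show where each sleep began.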
       -k, --vmlinux=<file>
           vmlinux pathname

       --ignore-vmlinux
           Ignore vmlinux files.

       --kallsyms=<file>
           kallsyms pathname

       --itrace
           Decode Instruction Tracing data, replacing it with synthesized
           events. Options are:

               i       synthesize instructions events
               y       synthesize cycles events
               b       synthesize branches events (branch misses for Arm SPE)
               c       synthesize branches events (calls only)
               r       synthesize branches events (returns only)
               x       synthesize transactions events
               w       synthesize ptwrite events
               p       synthesize power events (incl. PSB events for Intel PT)
               o       synthesize other events recorded due to the use
                       of aux-output (refer to perf record)
               I       synthesize interrupt or similar (asynchronous) events
                       (e.g. Intel PT Event Trace)
               e       synthesize error events
               d       create a debug log
               f       synthesize first level cache events
               m       synthesize last level cache events
               M       synthesize memory events
               t       synthesize TLB events
               a       synthesize remote access events
               g       synthesize a call chain (use with i or x)
               G       synthesize a call chain on existing event records
               l       synthesize last branch entries (use with i or x)
               L       synthesize last branch entries on existing event records
               s       skip initial number of events
               q       quicker (less detailed) decoding
               A       approximate IPC
               Z       prefer to ignore timestamps (so-called "timeless" decoding)

               The default is all events, i.e. the same as --itrace=iybxwpe,
               except for perf script where it is --itrace=ce.

               In addition, the period (default 100000, except for perf script where it is 1)
               for instructions events can be specified in units of:

               i       instructions
               t       ticks
               ms      milliseconds
               us      microseconds
               ns      nanoseconds (default)

               Also the call chain size (default 16, max. 1024) for instructions or
               transactions events can be specified.

               Also the number of last branch entries (default 64, max. 1024) for
               instructions or transactions events can be specified.

               Similar to options g and l, size may also be specified for options G and L.
               On x86, note that G and L work poorly when data has been recorded with
               large PEBS. Refer to the perf-intel-pt(1) man page for details.

               It is also possible to skip events generated (instructions, branches, transactions,
               ptwrite, power) at the beginning. This is useful to ignore initialization code.

               --itrace=i0nss1000000

               skips the first million instructions.

               The 'e' option may be followed by flags which affect what errors will or
               will not be reported. Each flag must be preceded by either '+' or '-'.
               The flags are:
                       o       overflow
                       l       trace data lost

               If supported, the 'd' option may be followed by flags which affect what
               debug messages will or will not be logged. Each flag must be preceded
               by either '+' or '-'. The flags are:
                       a       all perf events
                       e       output only on errors (size configurable - see perf-config(1))
                       o       output to stdout

               If supported, the 'q' option may be repeated to increase the effect.

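           To show how such a value is put together, the sketch below
           composes an --itrace string from an instructions period, a unit,
           and an optional skip count. The helper name itrace_instructions
           is hypothetical. The example i0nss1000000 above decomposes as
           'i' with period '0ns', followed by 's' with a count of 1000000.

```shell
# Compose an --itrace value: 'i' with a period and unit, optionally
# followed by 's<count>' to skip that many initial events.
itrace_instructions() {
    period=$1
    unit=$2
    skip=${3:-}
    case $unit in
        i|t|ms|us|ns) ;;
        *) echo "unknown unit: $unit" >&2; return 1 ;;
    esac
    printf '%s' "--itrace=i${period}${unit}"
    [ -z "$skip" ] || printf 's%s' "$skip"
    printf '\n'
}

itrace_instructions 0 ns 1000000   # rebuilds the skip example from the text
itrace_instructions 100 us         # instructions events every 100 microseconds
```

           Only the units listed above are accepted by the helper; any other
           unit makes it fail rather than produce a malformed option.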
       --strip
           Use with --itrace to strip out non-synthesized events.

       -j, --jit
           Process jitdump files by injecting the mmap records corresponding
           to jitted functions. This option also generates the ELF images for
           each jitted function found in the jitdump files captured in the
           input perf.data file. Use this option if you are monitoring an
           environment that uses JIT runtimes, such as Java, DART or V8.

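           A possible workflow for a V8-based runtime is sketched below. It
           assumes the runtime writes jitdump files when started with
           --perf-prof and that recording uses the monotonic clock so that
           jitdump and perf timestamps line up; app.js, the output file
           names, and the helpers run and jit_profile are hypothetical. By
           default the commands are only printed; set DRY_RUN=0 to execute
           them.

```shell
# Echo each command; execute it only when DRY_RUN=0 (default: print only).
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" != 0 ] || "$@"
}

# Record with the monotonic clock, inject the jitted code, then report
# on the augmented file.
jit_profile() {
    run perf record -k mono -o perf.data "$@"
    run perf inject --jit -i perf.data -o perf.data.jitted
    run perf report -i perf.data.jitted
}

jit_profile node --perf-prof app.js
```

           The report is run on the injected file, since only that file
           contains the mmap records for the jitted functions.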
       -f, --force
           Don’t complain, do it.

       --vm-time-correlation[=OPTIONS]
           Some architectures may capture AUX area data which contains
           timestamps affected by virtualization. This option will update
           those timestamps in place, to correlate with host timestamps. The
           in-place update means that an output file is not specified, and
           instead the input file is modified. The options are architecture
           specific, except that they may start with "dry-run" which will
           cause the file to be processed but without updating it. Currently
           this option is supported only by Intel PT; refer to perf-intel-pt(1).

       --guest-data=<path>,<pid>[,<time offset>[,<time scale>]]
           Insert events from a perf.data file recorded in a virtual machine
           at the same time as the input perf.data file was recorded on the
           host. The Process ID (PID) of the QEMU hypervisor process must be
           provided, and the time offset and time scale (multiplier) will
           likely be needed to convert guest time stamps into host time
           stamps. For example, for x86 the TSC Offset and Multiplier could be
           provided for a virtual machine using Linux command line option
           no-kvmclock. Currently only mmap, mmap2, comm, task,
           context_switch, ksymbol, and text_poke events are inserted, as well
           as build ID information. The QEMU option -name debug-threads=on is
           needed so that thread names can be used to determine which thread
           is running which VCPU. Note libvirt seems to use this by default.
           When using perf record in the guest, option --sample-identifier
           should be used, and also --buildid-all and --switch-events may be
           useful.

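           The nesting of the optional fields in --guest-data can be
           illustrated with a small sketch that assembles the option value.
           The helper name guest_data_arg is hypothetical, and the file
           name, PID, offset, and scale used below are made-up placeholders
           rather than values from a real recording.

```shell
# Build a --guest-data value from the guest perf.data path, the QEMU PID,
# and the optional time offset and time scale. The time scale can only
# follow a time offset, mirroring the nested brackets in the synopsis.
guest_data_arg() {
    path=${1:-}
    pid=${2:-}
    offset=${3:-}
    scale=${4:-}
    if [ -z "$path" ] || [ -z "$pid" ]; then
        echo 'usage: guest_data_arg <path> <pid> [<time offset> [<time scale>]]' >&2
        return 1
    fi
    arg="--guest-data=$path,$pid"
    [ -z "$offset" ] || arg="$arg,$offset"
    if [ -n "$scale" ]; then
        if [ -z "$offset" ]; then
            echo 'time scale given without time offset' >&2
            return 1
        fi
        arg="$arg,$scale"
    fi
    printf '%s\n' "$arg"
}

guest_data_arg guest.data 4321           # path and PID only
guest_data_arg guest.data 4321 1000 2    # with placeholder offset and scale
```

           Rejecting a scale without an offset keeps the positional fields
           unambiguous, since the option has no way to leave the offset
           empty.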
       --guestmount=<path>
           Guest OS root file system mount directory. Users mount guest OS
           root directories under <path> by a specific filesystem access
           method, typically sshfs. For example, with two guest OSes running,
           one with PID 8888 and the other with PID 9999:

               $ mkdir ~/guestmount
               $ cd ~/guestmount
               $ sshfs -o allow_other,direct_io -p 5551 localhost:/ 8888/
               $ sshfs -o allow_other,direct_io -p 5552 localhost:/ 9999/
               $ perf inject --guestmount=~/guestmount
