perf-trace(1)

PERF-TRACE(1)                     perf Manual                    PERF-TRACE(1)

NAME

       perf-trace - strace inspired tool

SYNOPSIS

       perf trace
       perf trace record

DESCRIPTION

       This command shows the events associated with the target: initially
       syscalls, but also other system events such as pagefaults, task
       lifetime events, and scheduling events.

       This is a live mode tool that also works with perf.data files, like
       the other perf tools. Files can be generated using the perf record
       command, but the session needs to include the raw_syscalls events (-e
       raw_syscalls:*). Alternatively, perf trace record can be used as a
       shortcut to automatically include the raw_syscalls events when writing
       events to a file.

       The following options apply to perf trace; options to perf trace record
       are found in the perf record man page.

OPTIONS

       -a, --all-cpus
           System-wide collection from all CPUs.

       -e, --expr, --event
           List of syscalls and other perf events (tracepoints, HW cache
           events, etc.) to show. Globbing is supported, e.g.: "epoll_*",
           "*msg*", etc. See perf list for a complete list of events. Prefixing
           with ! shows all syscalls but the ones specified. You may need to
           escape the ! to protect it from the shell.

       -D msecs, --delay msecs
           After starting the program, wait msecs before measuring. This is
           useful to filter out the startup phase of the program, which is
           often very different.

       -o, --output=
           Output file name.

       -p, --pid=
           Record events on existing process ID (comma separated list).

       -t, --tid=
           Record events on existing thread ID (comma separated list).

       -u, --uid=
           Record events in threads owned by uid. Name or number.

       -G, --cgroup
           Record events in threads in a cgroup.

               Look for cgroups to set at the /sys/fs/cgroup/perf_event directory, then
               remove the /sys/fs/cgroup/perf_event/ part and try:

               perf trace -G A -e sched:*switch

               This sets all of raw_syscalls:sys_{enter,exit}, pgfault, vfs_getname,
               etc. and sched:sched_switch to the 'A' cgroup, while:

               perf trace -e sched:*switch -G A

               sets only the sched:sched_switch event to the 'A' cgroup; all the
               other events (raw_syscalls:sys_{enter,exit}, etc.) are left "without"
               a cgroup (on the root cgroup, system wide, etc.).

               Multiple cgroups:

               perf trace -G A -e sched:*switch -G B

               Here the syscall events go to the 'A' cgroup and sched:sched_switch
               goes to the 'B' cgroup.
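
               The position-dependent behaviour of -G in the three command lines
               above can be summarised with a small toy model. This is only a
               sketch that reproduces those three documented cases; it is not
               perf's option parser, and the names assign_cgroups and
               BUILTIN_EVENTS are hypothetical. Behaviour for other orderings
               (for example two -G options before any -e) is an assumption.

```python
# Toy model of how -G position affects cgroup assignment (not perf's parser).
# BUILTIN_EVENTS stands in for the events perf trace sets up by itself.
BUILTIN_EVENTS = ["raw_syscalls:sys_enter", "raw_syscalls:sys_exit"]


def assign_cgroups(argv):
    """Map each event to a cgroup name, or None for "without a cgroup"."""
    default_cgroup = None   # set by a -G that precedes every -e
    explicit = {}           # events given with -e, in order of appearance
    seen_event = False
    args = iter(argv)
    for arg in args:
        if arg == "-e":
            explicit[next(args)] = None
            seen_event = True
        elif arg == "-G":
            name = next(args)
            if not seen_event:
                # -G before any -e: applies to the built-in events and is
                # the fallback for events listed later.
                default_cgroup = name
            else:
                # -G after -e: applies only to the events listed so far
                # that have not been claimed by an earlier trailing -G.
                for event in list(explicit):
                    if explicit[event] is None:
                        explicit[event] = name
    result = {event: default_cgroup for event in BUILTIN_EVENTS}
    for event, cgroup in explicit.items():
        result[event] = cgroup if cgroup is not None else default_cgroup
    return result


if __name__ == "__main__":
    for cmd in (["-G", "A", "-e", "sched:*switch"],
                ["-e", "sched:*switch", "-G", "A"],
                ["-G", "A", "-e", "sched:*switch", "-G", "B"]):
        print(" ".join(cmd), "->", assign_cgroups(cmd))
```

               In this model a leading -G acts as a default for everything,
               while a trailing -G only tags the events that precede it.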

       --filter-pids=
           Filter out events for these pids and for trace itself (comma
           separated list).

       -v, --verbose=
           Verbosity level.

       --no-inherit
           Child tasks do not inherit counters.

       -m, --mmap-pages=
           Number of mmap data pages (must be a power of two) or size
           specification with an appended unit character: B, K, M or G. The
           size is rounded up to the nearest power-of-two number of pages.
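
           As a rough illustration of the rounding described above, the
           following sketch converts a --mmap-pages style value into a page
           count. It assumes a 4 KiB page size and 1024-based units, and it
           rejects a bare page count that is not a power of two; these are
           assumptions for the example, and mmap_pages is a hypothetical
           helper, not perf's implementation.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; the real value is system dependent
UNIT_BYTES = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def mmap_pages(spec: str, page_size: int = PAGE_SIZE) -> int:
    """Return the number of data pages for a value such as "128" or "512K"."""
    spec = spec.strip()
    if not spec:
        raise ValueError("empty value")
    unit = spec[-1]
    if unit in UNIT_BYTES:
        # Size specification: convert to bytes, then to whole pages,
        # then round the page count up to a power of two.
        size_bytes = int(spec[:-1]) * UNIT_BYTES[unit]
        if size_bytes <= 0:
            raise ValueError("size must be positive")
        pages = -(-size_bytes // page_size)  # ceiling division
        return next_power_of_two(pages)
    # Bare number: a page count, which must already be a power of two.
    pages = int(spec)
    if pages <= 0 or pages & (pages - 1):
        raise ValueError("page count must be a power of two")
    return pages


if __name__ == "__main__":
    for value in ("128", "512K", "600K", "1M"):
        print(value, "->", mmap_pages(value), "pages")
```

           Under these assumptions, 600K needs 150 pages and is therefore
           rounded up to 256 pages.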

       -C, --cpu
           Collect samples only on the list of CPUs provided. Multiple CPUs
           can be provided as a comma-separated list with no space: 0,1.
           Ranges of CPUs are specified with -: 0-2. In per-thread mode with
           inheritance mode on (default), events are captured only when the
           thread executes on the designated CPUs. Default is to monitor all
           CPUs.
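
           The list syntax can be illustrated with a short parser. This is a
           minimal sketch of the comma and range notation described above;
           parse_cpu_list is a hypothetical name, and the handling of
           malformed input is an assumption rather than perf's behaviour.

```python
def parse_cpu_list(spec: str) -> list:
    """Expand a CPU list such as "0,1" or "0-2" into sorted CPU numbers."""
    if not spec or any(ch.isspace() for ch in spec):
        # The list is documented as comma-separated "with no space".
        raise ValueError("CPU list must be non-empty and contain no spaces")
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range "first-last" is inclusive on both ends.
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError("bad range: " + part)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("0,1"))    # single CPUs
    print(parse_cpu_list("0-2"))    # a range
    print(parse_cpu_list("0,2-3"))  # both forms combined
```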

       --duration
           Show only events that had a duration greater than N.M ms.

       --sched
           Accrue thread runtime and provide a summary at the end of the
           session.

       --failure
           Show only syscalls that failed, i.e. that returned < 0.

       -i, --input
           Process events from a given perf data file.

       -T, --time
           Print full timestamp rather than time relative to first sample.

       --comm
           Show process COMM right beside its ID, on by default, disable with
           --no-comm.

       -s, --summary
           Show only a summary of syscalls by thread with min, max, and
           average times (in msec) and relative stddev.

       -S, --with-summary
           Show all syscalls followed by a summary by thread with min, max,
           and average times (in msec) and relative stddev.

       --tool_stats
           Show tool stats such as the number of times fd→pathname was
           discovered through hooking the open syscall return + vfs_getname
           or via reading /proc/pid/fd, etc.

       -f, --force
           Don’t complain, do it.

       -F=[all|min|maj], --pf=[all|min|maj]
           Trace pagefaults. Optionally, you can specify whether you want
           minor, major or all pagefaults. Default value is maj.

       --syscalls
           Trace system calls. This option is enabled by default, disable
           with --no-syscalls.

       --call-graph [mode,type,min[,limit],order[,key][,branch]]
           Setup and enable call-graph (stack chain/backtrace) recording. See
           --call-graph section in perf-record and perf-report man pages for
           details. The modes most useful in perf trace are dwarf and lbr,
           where available. Try: perf trace --call-graph dwarf.

               Using this will, for the root user, bump the value of --mmap-pages to 4
               times the maximum for non-root users, based on the kernel.perf_event_mlock_kb
               sysctl. This is done only if the user doesn't specify a --mmap-pages value.

       --kernel-syscall-graph
           Show the kernel callchains on the syscall exit path.

       --max-stack
           Set the stack depth limit when parsing the callchain, anything
           beyond the specified depth will be ignored. Note that at this point
           this only affects presentation, i.e. the kernel is still not
           limiting the depth; the overhead of callchains needs to be
           controlled via the knobs in --call-graph dwarf.

               Implies '--call-graph dwarf' when --call-graph not present on the
               command line, on systems where DWARF unwinding was built in.

               Default: /proc/sys/kernel/perf_event_max_stack when present for
                        live sessions (without --input/-i), 127 otherwise.

       --min-stack
           Set the stack depth limit when parsing the callchain, anything
           below the specified depth will be ignored. Disabled by default.

               Implies '--call-graph dwarf' when --call-graph not present on the
               command line, on systems where DWARF unwinding was built in.

       --print-sample
           Print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info for the
           raw_syscalls:sys_{enter,exit} tracepoints, for debugging.

       --proc-map-timeout
           Processing /proc/XXX/mmap for pre-existing threads may take a long
           time, because the file may be huge. A time out is needed in such
           cases. This option sets the time out limit. The default value is
           500 ms.

PAGEFAULTS

       When tracing pagefaults, the format of the trace is as follows:

       <min|maj>fault [<ip.symbol>+<ip.offset>] ⇒ <addr.dso@addr.offset[1]>
       (<map type><addr level>).

       ·   min/maj indicates whether the fault event is minor or major;

       ·   ip.symbol shows the symbol for the instruction pointer (the code
           that generated the fault); if no debug symbols are available,
           perf trace will print the raw IP;

       ·   addr.dso shows the DSO for the faulted address;

       ·   map type is either d for non-executable maps or x for executable
           maps;

       ·   addr level is either k for kernel dso or . for user dso.

       For symbol resolution you may need to install debugging symbols.

       Please be aware that duration is currently always 0 and doesn’t reflect
       the actual time it took for the fault to be handled!

       When --verbose is specified, perf trace tries to print all available
       information for both IP and fault address in the form of
       dso@symbol[2]+offset.

EXAMPLES

       Trace only major pagefaults:

           $ perf trace --no-syscalls -F

       Trace syscalls, major and minor pagefaults:

           $ perf trace -F all

           1416.547 ( 0.000 ms): python/20235 majfault [CRYPTO_push_info_+0x0] => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0@0x61be0 (x.)

           As you can see, there was a major pagefault in the python process,
           from the CRYPTO_push_info_ routine, which faulted somewhere in
           libcrypto.so.
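
           For post-processing such output, the pagefault portion of a line
           can be picked apart with a regular expression. The sketch below
           follows the format given in the PAGEFAULTS section and is only
           checked against the sample line above; the field names and the
           parse_pagefault helper are hypothetical, and lines that print a
           raw IP instead of symbol+offset are not handled.

```python
import re

# <min|maj>fault [<ip.symbol>+<ip.offset>] => <addr.dso>@<addr.offset>
# (<map type><addr level>); both "=>" and "⇒" are accepted as the arrow.
PAGEFAULT_RE = re.compile(
    r"(?P<kind>min|maj)fault "
    r"\[(?P<ip_symbol>[^\]]+?)\+(?P<ip_offset>0x[0-9a-fA-F]+)\] "
    r"(?:=>|⇒) "
    r"(?P<addr_dso>\S+)@(?P<addr_offset>0x[0-9a-fA-F]+) "
    r"\((?P<map_type>[dx])(?P<addr_level>[k.])\)"
)


def parse_pagefault(line: str):
    """Return the pagefault fields of a trace line as a dict, or None."""
    match = PAGEFAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Offsets are printed in hexadecimal; convert them to integers.
    fields["ip_offset"] = int(fields["ip_offset"], 16)
    fields["addr_offset"] = int(fields["addr_offset"], 16)
    return fields


if __name__ == "__main__":
    sample = ("1416.547 ( 0.000 ms): python/20235 majfault "
              "[CRYPTO_push_info_+0x0] => "
              "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0@0x61be0 (x.)")
    print(parse_pagefault(sample))
```

           For the sample line this yields a major fault in an executable
           (x), user-level (.) mapping of libcrypto.so.1.0.0.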

NOTES

        1. addr.dso@addr.offset
           mailto:addr.dso@addr.offset

        2. dso@symbol
           mailto:dso@symbol

perf                              09/24/2019                     PERF-TRACE(1)