PERF-STAT(1) perf Manual PERF-STAT(1)

NAME
perf-stat - Run a command and gather performance counter statistics

SYNOPSIS
perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
perf stat [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]

DESCRIPTION
This command runs a command and gathers performance counter statistics
from it.

OPTIONS
<command>...
Any command you can specify in a shell.

-e, --event=
Select the PMU event. Selection can be:

· a symbolic event name (use perf list to list all events)

· a raw PMU event (eventsel+umask) in the form of rNNN where NNN
is a hexadecimal event descriptor.

· a symbolically formed event like pmu/param1=0x3,param2/ where
param1 and param2 are defined as formats for the PMU in
/sys/bus/event_source/devices/<pmu>/format/*

· a symbolically formed event like
pmu/config=M,config1=N,config2=K/ where M, N, K are numbers (in
decimal, hex, or octal format). Acceptable values for each of the
config, config1 and config2 parameters are defined by the
corresponding entries in
/sys/bus/event_source/devices/<pmu>/format/*

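As a rough illustration of the rNNN form described above, the following sketch packs an event select code and a unit mask into a raw descriptor string. It assumes the common x86 layout (event select in bits 0-7, unit mask in bits 8-15); other architectures encode raw events differently. The helper name raw_event and the example values are hypothetical and not part of perf.

```python
def raw_event(eventsel, umask):
    """Build an rNNN raw event string from an event select and a unit mask.

    Assumption: x86-style layout, eventsel in bits 0-7, umask in bits 8-15.
    """
    if not (0 <= eventsel <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("eventsel and umask must each fit in 8 bits")
    # Combine the two bytes and print them as lowercase hex after the "r".
    return f"r{(umask << 8) | eventsel:x}"


# Example values chosen for illustration only.
print(raw_event(0x2E, 0x41))  # r412e
```

Under that assumption, the resulting string is what would be passed to -e in place of a symbolic event name.
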
-i, --no-inherit
child tasks do not inherit counters

-p, --pid=<pid>
stat events on existing process id (comma separated list)

-t, --tid=<tid>
stat events on existing thread id (comma separated list)

-a, --all-cpus
system-wide collection from all CPUs

-c, --scale
scale/normalize counter values

-r, --repeat=<n>
repeat command and print average + stddev (max: 100). 0 means
forever.

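The "average + stddev" summary of repeated runs can be pictured with the sketch below. It uses the textbook mean and sample standard deviation from Python's statistics module; the exact formula and output format perf uses may differ. The function name summarize and the sample counts are hypothetical.

```python
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of per-run counts."""
    mean = statistics.mean(samples)
    # stdev needs at least two samples; a single run has no spread to report.
    stddev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, stddev


# Hypothetical cycle counts from three repeated runs.
mean, stddev = summarize([1000, 1100, 1200])
print(f"{mean:.1f} +- {stddev:.1f}")  # 1100.0 +- 100.0
```
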
-B, --big-num
print large numbers with thousands' separators according to locale

-C, --cpu=
Count only on the list of CPUs provided. Multiple CPUs can be
provided as a comma-separated list with no space: 0,1. Ranges of
CPUs are specified with -: 0-2. In per-thread mode, this option is
ignored. The -a option is still necessary to activate system-wide
monitoring. Default is to count on all CPUs.

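The CPU list syntax accepted by -C (comma-separated entries, with - for ranges) can be expanded as in the sketch below. This only illustrates the syntax described above and is not perf's own parser; the function name parse_cpu_list is hypothetical.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "0,1" or "0-2" into a sorted list of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range like "0-2" includes both endpoints.
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0,1"))  # [0, 1]
print(parse_cpu_list("0-2"))  # [0, 1, 2]
```
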
-A, --no-aggr
Do not aggregate counts across all monitored CPUs in system-wide
mode (-a). This option is only valid in system-wide mode.

-n, --null
null run - don't start any counters

-v, --verbose
be more verbose (show counter open errors, etc)

-x SEP, --field-separator SEP
print counts using a CSV-style output to make it easy to import
directly into spreadsheets. Columns are separated by the string
specified in SEP.

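Output produced with -x is intended for machine consumption. The sketch below shows one way a script might read it. It assumes each record starts with the counter value, a unit field, and the event name, in that order; the exact columns vary between perf versions, so check real output before relying on this. The function name parse_stat_csv and the sample text are hypothetical.

```python
def parse_stat_csv(text, sep=","):
    """Map event name -> counter value from CSV-style perf stat output.

    Assumption: each record begins with <count><sep><unit><sep><event name>.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split(sep)
        # Skip comment lines and anything too short to hold the three fields.
        if line.startswith("#") or len(fields) < 3:
            continue
        try:
            counts[fields[2]] = float(fields[0])
        except ValueError:
            # A non-numeric placeholder such as "<not counted>".
            counts[fields[2]] = None
    return counts


# Hypothetical sample text, reusing counts from the example output in this page.
sample = "24821162526,,cycles\n18687303457,,instructions\n"
print(parse_stat_csv(sample))
```

Splitting on the separator string rather than using a single-character CSV dialect keeps the sketch usable when SEP is longer than one character.
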
-G name, --cgroup name
monitor only in the container (cgroup) called "name". This option
is available only in per-cpu mode. The cgroup filesystem must be
mounted. All threads belonging to container "name" are monitored
when they run on the monitored CPUs. Multiple cgroups can be
provided. Each cgroup is applied to the corresponding event, i.e.,
first cgroup to first event, second cgroup to second event and so
on. It is possible to provide an empty cgroup (monitor all the
time) using, e.g., -G foo,,bar. Cgroups must have corresponding
events, i.e., they always refer to events defined earlier on the
command line.

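The positional pairing of cgroups with events can be sketched as follows. The function name pair_events_with_cgroups is hypothetical, the sketch handles only plain comma-separated symbolic event names, and treating events without a listed cgroup as unrestricted is an assumption made for illustration.

```python
def pair_events_with_cgroups(events, cgroups):
    """Pair each event with the cgroup in the same position; None means no cgroup."""
    event_list = events.split(",")
    cgroup_list = cgroups.split(",")
    if len(cgroup_list) > len(event_list):
        raise ValueError("each cgroup needs a corresponding event")
    # Assumption: events past the end of the cgroup list are left unrestricted.
    cgroup_list += [""] * (len(event_list) - len(cgroup_list))
    # An empty entry (as in "foo,,bar") means "monitor all the time".
    return [(event, cgroup or None) for event, cgroup in zip(event_list, cgroup_list)]


print(pair_events_with_cgroups("cycles,instructions,cache-misses", "foo,,bar"))
# [('cycles', 'foo'), ('instructions', None), ('cache-misses', 'bar')]
```
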
-o file, --output file
Print the output into the designated file.

--append
Append to the output file designated with the -o option. Ignored if
-o is not specified.

--log-fd
Log output to fd, instead of stderr. Complementary to --output, and
mutually exclusive with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd

--pre, --post
Pre and post measurement hooks, e.g.:

perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \
-- make -s -j64 O=defconfig-build/ bzImage

-I msecs, --interval-print msecs
Print count deltas at the given interval in milliseconds (minimum:
100 ms). Example:
perf stat -I 1000 -e cycles -a sleep 5

--per-socket
Aggregate counts per processor socket for system-wide mode
measurements. This is a useful mode to detect imbalance between
sockets. To enable this mode, use --per-socket in addition to -a
(system-wide). The output includes the socket number and the number
of online processors on that socket. This is useful to gauge the
amount of aggregation.

--per-core
Aggregate counts per physical processor for system-wide mode
measurements. This is a useful mode to detect imbalance between
physical cores. To enable this mode, use --per-core in addition to
-a (system-wide). The output includes the core number and the
number of online logical processors on that physical processor.

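Conceptually, the per-socket and per-core modes group per-CPU counts by a topology key and report the group size next to each total. The sketch below illustrates the idea for sockets. The function name aggregate_per_socket, the per-CPU counts, and the CPU-to-socket map are all hypothetical, and this is not how perf itself is implemented.

```python
from collections import defaultdict


def aggregate_per_socket(per_cpu_counts, cpu_to_socket):
    """Return {socket: (number of CPUs, summed count)} from per-CPU counts."""
    totals = defaultdict(int)
    cpus = defaultdict(int)
    for cpu, count in per_cpu_counts.items():
        socket = cpu_to_socket[cpu]
        totals[socket] += count
        cpus[socket] += 1
    return {socket: (cpus[socket], totals[socket]) for socket in sorted(totals)}


# Hypothetical per-CPU cycle counts and topology (two sockets, two CPUs each).
per_cpu = {0: 100, 1: 150, 2: 400, 3: 350}
topology = {0: 0, 1: 0, 2: 1, 3: 1}
print(aggregate_per_socket(per_cpu, topology))
# {0: (2, 250), 1: (2, 750)}
```

In this made-up data the second socket accounts for three times the count of the first, which is the kind of imbalance these modes are meant to expose.
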
-D msecs, --delay msecs
After starting the program, wait msecs before measuring. This is
useful to filter out the startup phase of the program, which is
often very different.

-T, --transaction
Print statistics of transactional execution if supported.

EXAMPLES
$ perf stat -- make -j

Performance counter stats for 'make -j':

   8117.370256  task clock ticks     #      11.281 CPU utilization factor
           678  context switches     #       0.000 M/sec
           133  CPU migrations       #       0.000 M/sec
        235724  pagefaults           #       0.029 M/sec
   24821162526  CPU cycles           #    3057.784 M/sec
   18687303457  instructions         #    2302.138 M/sec
     172158895  cache references     #      21.209 M/sec
      27075259  cache misses         #       3.335 M/sec

Wall-clock time elapsed:   719.554352 msecs

SEE ALSO
perf-top(1), perf-list(1)

perf 06/18/2019 PERF-STAT(1)