PERF-LIST(1) perf Manual PERF-LIST(1)

perf-list - List all symbolic event types

perf list [--no-desc] [--long-desc]
[hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

This command displays the symbolic event types which can be selected in
the various perf commands with the -e option.

-d, --desc
Print extra event descriptions. (default)

--no-desc
Don’t print descriptions.

-v, --long-desc
Print longer event descriptions.

--debug
Enable debugging output.

--details
Print how named events are resolved internally into perf events,
and also any extra expressions computed by perf stat.

--deprecated
Print deprecated events. By default the deprecated events are
hidden.

Events can optionally be qualified by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

u - user-space counting
k - kernel counting
h - hypervisor counting
I - non idle counting
G - guest counting (in KVM guests)
H - host counting (not in KVM guests)
p - precise level
P - use maximum detected precise level
S - read sample value (PERF_SAMPLE_READ)
D - pin the event to the PMU
W - group is weak and will fall back to non-group if not schedulable
e - group or event are exclusive and do not share the PMU

The p modifier can be used for specifying how precise the instruction
address should be. The p modifier can be specified multiple times; the
number of times it is given selects the precise level:

0 - SAMPLE_IP can have arbitrary skid
1 - SAMPLE_IP must have constant skid
2 - SAMPLE_IP requested to have 0 skid
3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
sample shadowing effects.
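As a minimal sketch of how the levels above map to modifier strings, the shell function below prints the suffix for a given precise level, assuming that each repetition of p raises the level by one. The name precise_suffix is hypothetical and is not part of perf.

```shell
# precise_suffix LEVEL
# Print the event modifier suffix for precise level 0-3.
# Hypothetical helper for illustration; perf itself provides no such command.
precise_suffix() {
    case "$1" in
        0) printf '\n' ;;        # no p modifier: arbitrary skid
        1) printf ':p\n' ;;      # constant skid
        2) printf ':pp\n' ;;     # 0 skid requested
        3) printf ':ppp\n' ;;    # 0 skid required
        *) echo "precise level must be 0-3" >&2; return 1 ;;
    esac
}
```

For example, perf record -e "cycles$(precise_suffix 2)" ... would request precise level 2, subject to what the hardware supports.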

For Intel systems precise event sampling is implemented with PEBS, which
supports up to precise level 2, and precise level 3 for some special
cases.

On AMD systems it is implemented using IBS (up to precise level 2). The
precise modifier works with event types 0x76 (cpu-cycles, CPU clocks
not halted) and 0xC1 (micro-ops retired). Both events map to IBS
execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set respectively (see AMD64 Architecture Programmer’s
Manual Volume 2: System Programming, 13.3 Instruction-Based Sampling).
Examples of using IBS:

perf record -a -e cpu-cycles:p ... # use ibs op counting cycles
perf record -a -e r076:p ... # same as -e cpu-cycles:p
perf record -a -e r0C1:p ... # use ibs op counting micro-ops

Even when an event is not available in a symbolic form within perf
right now, it can be encoded in a processor-specific way.

For instance, for x86 CPUs NNN represents the raw register encoding with
the layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32
Architectures Software Developer’s Manual Volume 3B: System Programming
Guide] Figure 30-1 Layout of IA32_PERFEVTSELx MSRs) or AMD’s
PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume 2:
System Programming], Page 344, Figure 13-7 Performance Event-Select
Register (PerfEvtSeln)).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host-only
and OS/user mode flags must be set up using EVENT MODIFIERS.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

Event  Umask  Event Mask
Num.   Value  Mnemonic    Description                        Comment

A8H    01H    LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                          delivered by loop stream detector  invert to count
                                                             cycles

raw encoding of 0x1A8 can be used:

perf stat -e r1a8 -a sleep 1
perf record -e r1a8 ...

It’s also possible to use pmu syntax:

perf record -e r1a8 -a sleep 1
perf record -e cpu/r1a8/ ...
perf record -e cpu/r0x1a8/ ...

You should refer to the processor specific documentation for getting
these details. Some of them are referenced in the SEE ALSO section
below.
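The raw code used in the example above can be derived from the event number and umask. The following is a minimal sketch, assuming the IA32_PERFEVTSELx layout in which the event select occupies bits 0-7, the umask bits 8-15 and the cmask bits 24-31; the edge and inv bits are left out for brevity. The function name raw_event is hypothetical.

```shell
# raw_event EVENT UMASK [CMASK]
# Print an rNNN raw event code built from the given fields.
# Hypothetical helper; field positions assume the IA32_PERFEVTSELx layout.
raw_event() {
    event=$(( $1 ))
    umask_bits=$(( $2 ))
    cmask=$(( ${3:-0} ))      # optional, defaults to 0
    printf 'r%x\n' $(( (cmask << 24) | (umask_bits << 8) | event ))
}
```

With the LSD.UOPS values from the table, raw_event 0xa8 0x01 prints r1a8, which can then be passed to perf stat -e or perf record -e.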

perf also supports an extended syntax for specifying raw parameters to
PMUs. Using this typically requires looking up the specific event in
the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with

ls /sys/devices/*/format
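The listing above only shows file names. As a sketch, assuming each file under a PMU's format directory names one raw parameter and contains its bit-field description, the loop below prints one line per PMU parameter. The function name list_pmu_formats and its optional directory argument are hypothetical and exist only for illustration.

```shell
# list_pmu_formats [SYSFS_DEVICES_DIR]
# Print "<pmu> <parameter> <bit field>" for every PMU format file.
# Hypothetical helper; the directory defaults to /sys/devices.
list_pmu_formats() {
    base=${1:-/sys/devices}
    for dir in "$base"/*/format; do
        [ -d "$dir" ] || continue              # skip if the glob matched nothing
        pmu=$(basename "$(dirname "$dir")")
        for field in "$dir"/*; do
            [ -f "$field" ] || continue
            printf '%s %s %s\n' "$pmu" "$(basename "$field")" "$(cat "$field")"
        done
    done
}
```

Calling list_pmu_formats with no argument reads the running system; the exact PMUs and parameters shown depend on the hardware and kernel.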

For example, the raw "LSD.UOPS" core PMU event above could be
specified as

perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

or using extended name syntax

perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted
globally with perf stat -a. They can be bound to one logical CPU, but
will measure all the CPUs in the same socket.

This example measures memory bandwidth every second on the first memory
controller on socket 0 of an Intel Xeon system:

perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together. To simplify creation of multiple
events, prefix and glob matching is supported in the PMU name, and the
prefix uncore_ is also ignored when performing the match. So the
command above can be expanded to all memory controllers by using either
of the following syntaxes:

perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
perf stat -C 0 -a -e *imc*/cas_count_read/,*imc*/cas_count_write/ -I 1000 ...

This example measures the combined core power every second:

perf stat -I 1000 -e power/energy-cores/ -a

For non-root users, generally only context-switched PMU events are
available. These are normally only the events in the cpu PMU, the
predefined events like cycles and instructions, and some software
events.

Other PMUs and global measurements are normally root only. Some event
qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid sysctl
to -1, which allows non-root users to use these events.

For accessing tracepoint events perf needs to have read access to
/sys/kernel/debug/tracing, even when perf_event_paranoid is in a
relaxed setting.
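Before running measurements as a non-root user it can help to check the current setting. The sketch below assumes the sysctl is exposed as /proc/sys/kernel/perf_event_paranoid and only tests for the value -1 mentioned above. The function names and the optional file argument are hypothetical.

```shell
# perf_paranoid_level [FILE]
# Print the current perf_event_paranoid value.
# Hypothetical helper; FILE defaults to the usual procfs path.
perf_paranoid_level() {
    cat "${1:-/proc/sys/kernel/perf_event_paranoid}"
}

# non_root_unrestricted [FILE]
# Succeed (exit status 0) only if the level is -1.
non_root_unrestricted() {
    [ "$(perf_paranoid_level "$1")" -eq -1 ]
}
```

A script could then use non_root_unrestricted || echo "some PMU events may need root" before invoking perf stat.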

Some PMUs control advanced hardware tracing capabilities, such as Intel
PT, that allow low-overhead execution tracing. These are described in
a separate intel-pt.txt document.

Some PMU events listed by perf list will be displayed with ? in them.
For example:

hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for ? must also be
supplied. For example:

perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
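When scripting such measurements, the placeholder can be filled in mechanically. This is a minimal sketch, assuming the same value is wanted for every ? in the event string and that the value contains no characters special to sed such as / or &. The function name fill_event_param is hypothetical.

```shell
# fill_event_param EVENT VALUE
# Replace every ? in EVENT with VALUE and print the result.
# Hypothetical helper; VALUE must not contain / or & (sed metacharacters).
fill_event_param() {
    printf '%s\n' "$1" | sed "s/?/$2/g"
}
```

For instance, fill_event_param 'hv_gpci/dtbp_ptitc,phys_processor_idx=?/' 0x2 prints the event string used in the perf stat example above.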

EVENT QUALIFIERS:

It is also possible to add extra qualifiers to an event:

percore:

Sums up the event counts for all hardware threads in a core, e.g.:

perf stat -e cpu/event=0,umask=0x3,percore=1/

Perf supports time based multiplexing of events, when the number of
events active exceeds the number of hardware performance counters.
Multiplexing can cause measurement errors when the workload changes its
execution profile.

When metrics are computed using formulas from event counts, it is
useful to ensure some events are always measured together as a group to
minimize multiplexing errors. Event groups can be specified using { }.

perf stat -e '{instructions,cycles}' ...
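Group strings can also be assembled from a list of event names. The sketch below only joins its arguments with commas and wraps them in braces; it does not check that the events fit the available counters. The function name make_group is hypothetical.

```shell
# make_group EVENT...
# Print the arguments as a single event group, e.g. {instructions,cycles}.
# Hypothetical helper; runs in a subshell so the caller's IFS is untouched.
make_group() {
    ( IFS=,; printf '{%s}\n' "$*" )
}
```

The result should be quoted when passed on, for example perf stat -e "$(make_group instructions cycles)" ..., so that the shell does not interpret the braces.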

The number of available performance counters depends on the CPU. A group
cannot contain more events than available counters. For example, Intel
Core CPUs typically have four generic performance counters for the
core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can be scheduled on, and may not support multiple instances in a single
group. When too many events are specified in the group some of them will
not be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by
default. The NMI watchdog can be disabled as root with

echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

perf also supports group leader sampling using the :S specifier.

perf record -e '{cycles,instructions}:S' ...
perf report --group

Normally all events in an event group sample, but with :S only the
first event (the leader) samples, and it only reads the values of the
other events in the group.

However, in the case of AUX area events (e.g. Intel PT or CoreSight), the
AUX area event must be the leader, so then the second event samples,
not the first.

Without options all known events will be listed.

To limit the list use:

1. hw or hardware to list hardware events such as cache-misses, etc.

2. sw or software to list software events such as context switches,
etc.

3. cache or hwcache to list hardware cache events such as
L1-dcache-loads, etc.

4. tracepoint to list all tracepoint events, alternatively use
subsys_glob:event_glob to filter by tracepoint subsystems such as
sched, block, etc.

5. pmu to print the kernel supplied PMU events.

6. sdt to list all Statically Defined Tracepoint events.

7. metric to list metrics.

8. metricgroup to list metricgroups with metrics.

9. If none of the above is matched, it will apply the supplied glob to
all events, printing the ones that match.

10. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for
the types specified.

Support raw format:

1. --raw-dump, shows the raw-dump of all the events.

2. --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob], shows the
raw-dump of a certain kind of events.

perf-stat(1), perf-top(1), perf-record(1), Intel® 64 and IA-32
Architectures Software Developer’s Manual Volume 3B: System Programming
Guide[1], AMD64 Architecture Programmer’s Manual Volume 2: System
Programming[2]

1. Intel® 64 and IA-32 Architectures Software Developer’s Manual
Volume 3B: System Programming Guide
http://www.intel.com/sdm/

2. AMD64 Architecture Programmer’s Manual Volume 2: System Programming
http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf

perf 06/03/2021 PERF-LIST(1)