PERF-LIST(1)                      perf Manual                      PERF-LIST(1)

NAME
perf-list - List all symbolic event types

SYNOPSIS
perf list [--no-desc] [--long-desc]
          [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION
This command displays the symbolic event types which can be selected in
the various perf commands with the -e option.

OPTIONS
-d, --desc
    Print extra event descriptions. (default)

--no-desc
    Don’t print descriptions.

-v, --long-desc
    Print longer event descriptions.

--debug
    Enable debugging output.

--details
    Print how named events are resolved internally into perf events,
    and also any extra expressions computed by perf stat.

--deprecated
    Print deprecated events. By default the deprecated events are
    hidden.

--cputype
    Print only the events that apply to CPUs of this type on hybrid
    platforms (e.g. --cputype core or --cputype atom).

EVENT MODIFIERS
Events can optionally have a modifier by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

    u - user-space counting
    k - kernel counting
    h - hypervisor counting
    I - non idle counting
    G - guest counting (in KVM guests)
    H - host counting (not in KVM guests)
    p - precise level
    P - use maximum detected precise level
    S - read sample value (PERF_SAMPLE_READ)
    D - pin the event to the PMU
    W - group is weak and will fall back to non-group if not schedulable
    e - group or event is exclusive and does not share the PMU
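
As a minimal sketch of how a script might attach modifiers to an event name
before passing it to -e: the helper name with_modifiers is hypothetical and is
not part of perf. It only checks that every character is one of the modifier
letters listed above, not that the combination is meaningful on a given PMU.

```shell
# Hypothetical helper (not part of perf): append modifiers to an event name,
# rejecting any character that is not in the modifier list above.
# Note: plain POSIX sh, so the variables below are global.
with_modifiers() {
    event=$1
    mods=$2
    case $mods in
        '' | *[!ukhIGHpPSDWe]*)
            echo "invalid modifier string: '$mods'" >&2
            return 1
            ;;
    esac
    printf '%s:%s\n' "$event" "$mods"
}

with_modifiers cycles u          # prints cycles:u
with_modifiers instructions uk   # prints instructions:uk
```

The printed string can be used directly, for example as
perf stat -e "$(with_modifiers cycles u)" ...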

The p modifier can be used for specifying how precise the instruction
address should be. The p modifier can be specified multiple times; the
number of times it is given selects the precise level:

    0 - SAMPLE_IP can have arbitrary skid
    1 - SAMPLE_IP must have constant skid
    2 - SAMPLE_IP requested to have 0 skid
    3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
        sample shadowing effects.
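
The mapping from precise level to repeated p modifiers can be sketched as a
small shell helper. The name precise_event is hypothetical and not part of
perf; whether a given level is actually supported still depends on the
hardware.

```shell
# Hypothetical helper (not part of perf): turn a precise level (0-3) into
# the matching number of 'p' modifiers appended to the event name.
precise_event() {
    event=$1
    level=$2
    case $level in
        0) printf '%s\n' "$event" ;;
        1) printf '%s:p\n' "$event" ;;
        2) printf '%s:pp\n' "$event" ;;
        3) printf '%s:ppp\n' "$event" ;;
        *) echo "precise level must be 0-3" >&2; return 1 ;;
    esac
}

precise_event cycles 2    # prints cycles:pp
```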

For Intel systems precise event sampling is implemented with PEBS, which
supports up to precise level 2, and precise level 3 for some special
cases.

On AMD systems it is implemented using IBS (up to precise level 2). The
precise modifier works with event types 0x76 (cpu-cycles, CPU clocks
not halted) and 0xC1 (micro-ops retired). Both events map to IBS
execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set accordingly (see the Core Complex (CCX) → Processor
x86 Core → Instruction Based Sampling (IBS) section of the [AMD
Processor Programming Reference (PPR)] relevant to the family, model
and stepping of the processor being used).

Examples of using IBS:

    perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
    perf record -a -e r076:p ...          # same as -e cpu-cycles:p
    perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

RAW HARDWARE EVENT DESCRIPTOR
Even when an event is not available in a symbolic form within perf
right now, it can be encoded in a processor-specific way.

For instance, on x86 CPUs a raw event is written as rN, where N is a
hexadecimal value that represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures
Software Developer’s Manual Volume 3B: System Programming Guide]
Figure 30-1 Layout of IA32_PERFEVTSELx MSRs) or AMD’s PERF_CTL MSRs
(see the Core Complex (CCX) → Processor x86 Core → MSR Registers
section of the [AMD Processor Programming Reference (PPR)] relevant to
the family, model and stepping of the processor being used).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, the
guest/host-only and OS/user-mode flags must be set up using the
modifiers described under EVENT MODIFIERS.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

    Event  Umask  Event Mask
    Num.   Value  Mnemonic   Description                        Comment

    A8H    01H    LSD.UOPS   Counts the number of micro-ops     Use cmask=1 and
                             delivered by loop stream detector  invert to count
                                                                cycles

raw encoding of 0x1A8 can be used:

    perf stat -e r1a8 -a sleep 1
    perf record -e r1a8 ...

It’s also possible to use pmu syntax:

    perf record -e r1a8 -a sleep 1
    perf record -e cpu/r1a8/ ...
    perf record -e cpu/r0x1a8/ ...
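
The raw value in this example can be derived with shell arithmetic. The sketch
below assumes the IA32_PERFEVTSELx field positions (event in bits 0-7, umask
in bits 8-15, inv in bit 23, cmask in bits 24-31); the helper name
x86_raw_event is hypothetical and not part of perf.

```shell
# Sketch: pack fields into the assumed IA32_PERFEVTSELx layout
#   event: bits 0-7, umask: bits 8-15, inv: bit 23, cmask: bits 24-31
# Arguments: EVENT UMASK [CMASK] [INV], as decimal or 0x-prefixed hex.
x86_raw_event() {
    event=$1 umask=$2 cmask=${3:-0} inv=${4:-0}
    printf 'r%x\n' $(( ($event & 0xff) | (($umask & 0xff) << 8) | (($inv & 1) << 23) | (($cmask & 0xff) << 24) ))
}

x86_raw_event 0xa8 0x01        # prints r1a8
x86_raw_event 0xa8 0x01 1 1    # prints r18001a8 (cmask=1, inv=1)
```

The second call corresponds to the "use cmask=1 and invert" comment in the
table above.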

Some processors, like those from AMD, support event codes and unit
masks larger than a byte. In such cases, the bits corresponding to the
event configuration parameters can be seen with:

    cat /sys/bus/event_source/devices/<pmu>/format/<config>
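
A format file holds a bit-range specification. As a sketch, the helper below
(hypothetical name format_mask, not part of perf) turns such a specification
into a hexadecimal bit mask. The sample string config:0-7,32-35 is an assumed
example of what such a file may contain; the real contents depend on the PMU.

```shell
# Sketch: convert a format spec such as "config:0-7,32-35" into a bit mask.
# The body runs in a subshell ( ... ) so the IFS change does not leak out.
format_mask() (
    spec=${1#*:}          # drop the "config:" prefix
    mask=0
    IFS=,
    for range in $spec; do
        lo=${range%-*}    # "0-7" -> 0, "23" -> 23
        hi=${range#*-}    # "0-7" -> 7, "23" -> 23
        bit=$lo
        while [ "$bit" -le "$hi" ]; do
            mask=$(( mask | (1 << bit) ))
            bit=$(( bit + 1 ))
        done
    done
    printf '0x%x\n' "$mask"
)

format_mask "config:0-7,32-35"   # prints 0xf000000ff
```

On a real system the argument could come from the file itself, for example
format_mask "$(cat /sys/bus/event_source/devices/cpu/format/event)".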

Example:

If the AMD docs for an EPYC 7713 processor describe an event as:

    Event  Umask  Event Mask
    Num.   Value  Mnemonic                        Description

    28FH   03H    op_cache_hit_miss.op_cache_hit  Counts Op Cache micro-tag
                                                  hit events.

raw encoding of 0x0328F cannot be used since the upper nibble of the
EventSelect bits has to be specified via bits 32-35 as can be seen
with:

    cat /sys/bus/event_source/devices/cpu/format/event

raw encoding of 0x20000038F should be used instead:

    perf stat -e r20000038f -a sleep 1
    perf record -e r20000038f ...

It’s also possible to use pmu syntax:

    perf record -e r20000038f -a sleep 1
    perf record -e cpu/r20000038f/ ...
    perf record -e cpu/r0x20000038f/ ...
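
The AMD encoding above can be reproduced with shell arithmetic. This sketch
assumes the layout implied by the example (EventSelect bits 0-7 in bits 0-7,
the unit mask in bits 8-15, EventSelect bits 8-11 in bits 32-35); the helper
name amd_raw_event is hypothetical and not part of perf.

```shell
# Sketch under the layout implied by the example above:
#   EventSelect[7:0]  -> bits 0-7
#   unit mask         -> bits 8-15
#   EventSelect[11:8] -> bits 32-35
amd_raw_event() {
    event=$1 umask=$2
    printf 'r%x\n' $(( ($event & 0xff) | (($umask & 0xff) << 8) | ((($event >> 8) & 0xf) << 32) ))
}

amd_raw_event 0x28f 0x03   # prints r20000038f
```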

You should refer to the processor-specific documentation for these
details. Some of it is referenced in the SEE ALSO section below.

ARBITRARY PMUS
perf also supports an extended syntax for specifying raw parameters to
PMUs. Using this typically requires looking up the specific event in
the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with:

    ls /sys/devices/*/format

For example, the "LSD.UOPS" core PMU event above could be specified as:

    perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

or using extended name syntax:

    perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...
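
When such event strings are generated by a script, the pmu/term,term/ form
shown above can be assembled with a small helper. This is only a sketch; the
name pmu_event is hypothetical and not part of perf, and the helper does not
check the terms against the PMU's format directory.

```shell
# Hypothetical helper (not part of perf): join key=value terms into the
# pmu/term,term,.../ syntax. First argument is the PMU name.
pmu_event() {
    pmu=$1
    shift
    terms=
    for term in "$@"; do
        terms=${terms:+$terms,}$term
    done
    printf '%s/%s/\n' "$pmu" "$terms"
}

pmu_event cpu event=0xa8 umask=0x1 cmask=0x1 name=LSD.UOPS_CYCLES
# prints cpu/event=0xa8,umask=0x1,cmask=0x1,name=LSD.UOPS_CYCLES/
```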

PER SOCKET PMUS
Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted
globally with perf stat -a. They can be bound to one logical CPU, but
will measure all the CPUs in the same socket.

This example measures memory bandwidth every second on the first memory
controller on socket 0 of an Intel Xeon system:

    perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together. To simplify creation of multiple
events, prefix and glob matching is supported in the PMU name, and the
prefix uncore_ is also ignored when performing the match. So the
command above can be expanded to all memory controllers by using either
of these forms (the glob is quoted to protect it from the shell):

    perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
    perf stat -C 0 -a -e '*imc*/cas_count_read/,*imc*/cas_count_write/' -I 1000 ...
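
The matching rule described above can be approximated in shell to predict
which PMU names a given pattern would select. This is a rough sketch of the
rule as stated in this section, not perf's own implementation, and the helper
name pmu_matches is hypothetical.

```shell
# Approximation of the rule described above (not perf's own code):
# the pattern is tried as a prefix/glob against the PMU name, both with
# and without a leading "uncore_".
pmu_matches() {
    pattern=$1
    full=$2
    short=${full#uncore_}        # the uncore_ prefix is ignored
    case $short in
        $pattern*) return 0 ;;
    esac
    case $full in
        $pattern*) return 0 ;;
    esac
    return 1
}

for pmu in uncore_imc_0 uncore_imc_1 uncore_cbox_0; do
    if pmu_matches imc "$pmu"; then
        echo "imc selects $pmu"
    fi
done
```

With the three sample names in the loop, only the two imc PMUs are reported.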

This example measures the combined core power every second:

    perf stat -I 1000 -e power/energy-cores/ -a

ACCESS RESTRICTIONS
For non-root users, generally only context-switched PMU events are
available. These are normally only the events in the cpu PMU, the
predefined events like cycles and instructions, and some software
events.

Other PMUs and global measurements are normally root only. Some event
qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid sysctl
to -1, which allows non-root users to use these events.
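
A script can check the current setting before attempting such measurements.
The sketch below reads the sysctl through its usual procfs path,
/proc/sys/kernel/perf_event_paranoid; the helper name paranoid_is_relaxed is
hypothetical, and it only tests for the value -1 discussed above.

```shell
# Sketch: succeed if the setting is the fully relaxed value (-1).
# An explicit value may be passed as $1; otherwise the procfs file is read.
paranoid_is_relaxed() {
    value=${1:-$(cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null || true)}
    [ "$value" = "-1" ]
}

if paranoid_is_relaxed; then
    echo "perf_event_paranoid is -1: non-root users may use these events"
else
    echo "restricted; as root, run: sysctl -w kernel.perf_event_paranoid=-1"
fi
```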

For accessing trace point events perf needs to have read access to
/sys/kernel/debug/tracing, even when perf_event_paranoid is in a
relaxed setting.

TRACING
Some PMUs control advanced hardware tracing capabilities, such as Intel
PT, which allow low overhead execution tracing. These are described in
a separate intel-pt.txt document.

PARAMETERIZED EVENTS
Some PMU events listed by perf list will be displayed with ? in them.
For example:

    hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for ? must also be
supplied. For example:

    perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
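
Filling in the placeholder can be scripted. A minimal sketch, assuming the
value contains no slash or other characters special to sed; the helper name
fill_param is hypothetical and not part of perf, and it replaces only the
first =? it finds.

```shell
# Hypothetical helper (not part of perf): replace the first "=?" in a
# parameterized event string with "=VALUE".
fill_param() {
    printf '%s\n' "$1" | sed "s/=?/=$2/"
}

fill_param 'hv_gpci/dtbp_ptitc,phys_processor_idx=?/' 0x2
# prints hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/
```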

EVENT QUALIFIERS:

It is also possible to add extra qualifiers to an event:

percore:

Sums up the event counts for all hardware threads in a core, e.g.:

    perf stat -e cpu/event=0,umask=0x3,percore=1/

EVENT GROUPS
Perf supports time based multiplexing of events, when the number of
events active exceeds the number of hardware performance counters.
Multiplexing can cause measurement errors when the workload changes its
execution profile.

When metrics are computed using formulas from event counts, it is
useful to ensure some events are always measured together as a group to
minimize multiplexing errors. Event groups can be specified using { }.

    perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A
group cannot contain more events than available counters. For example,
Intel Core CPUs typically have four generic performance counters for
the core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can schedule, and may not support multiple instances in a single group.
When too many events are specified in the group some of them will not
be measured.
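
A script that builds groups can guard against oversized groups up front. In
this sketch the counter limit is supplied by the caller and is purely an
assumption; it ignores fixed counters and per-event scheduling restrictions.
The helper name make_group is hypothetical and not part of perf.

```shell
# Hypothetical helper (not part of perf): build a {a,b,...} group string,
# refusing groups larger than an assumed counter limit (first argument).
make_group() {
    max=$1
    shift
    if [ "$#" -gt "$max" ]; then
        echo "group has $# events but only $max counters are assumed" >&2
        return 1
    fi
    group=
    for ev in "$@"; do
        group=${group:+$group,}$ev
    done
    printf '{%s}\n' "$group"
}

make_group 4 instructions cycles   # prints {instructions,cycles}
```

The result should be quoted when passed on, for example
perf stat -e "$(make_group 4 instructions cycles)" ...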

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by
default. The NMI watchdog can be disabled as root with:

    echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

LEADER SAMPLING
perf also supports group leader sampling using the :S specifier.

    perf record -e '{cycles,instructions}:S' ...
    perf report --group

Normally all events in an event group sample, but with :S only the
first event (the leader) samples, and it only reads the values of the
other events in the group.

However, in the case of AUX area events (e.g. Intel PT or CoreSight),
the AUX area event must be the leader, so then the second event
samples, not the first.

OPTIONS
Without options all known events will be listed.

To limit the list use:

 1. hw or hardware to list hardware events such as cache-misses, etc.

 2. sw or software to list software events such as context switches,
    etc.

 3. cache or hwcache to list hardware cache events such as
    L1-dcache-loads, etc.

 4. tracepoint to list all tracepoint events, alternatively use
    subsys_glob:event_glob to filter by tracepoint subsystems such as
    sched, block, etc.

 5. pmu to print the kernel supplied PMU events.

 6. sdt to list all Statically Defined Tracepoint events.

 7. metric to list metrics.

 8. metricgroup to list metricgroups with metrics.

 9. If none of the above is matched, it will apply the supplied glob to
    all events, printing the ones that match.

10. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for
the types specified.

Raw format is also supported:

 1. --raw-dump, shows the raw-dump of all the events.

 2. --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob], shows the
    raw-dump of a certain kind of events.
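
A few invocations built from the filters above, as a sketch. The wrapper only
prints each command unless RUN=1 is set, so it can be tried on a machine
without perf; the wrapper name show and the RUN variable are hypothetical
conveniences, not part of perf.

```shell
# Print each command; execute it as well only when RUN=1 is set.
show() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

show perf list hw cache        # hardware and hardware cache events together
show perf list 'sched:*'       # tracepoints in the sched subsystem
show perf list --raw-dump pmu  # raw dump of kernel supplied PMU events
```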

SEE ALSO
perf-stat(1), perf-top(1), perf-record(1), Intel® 64 and IA-32
Architectures Software Developer’s Manual Volume 3B: System Programming
Guide[1], AMD Processor Programming Reference (PPR)[2]

NOTES
 1. Intel® 64 and IA-32 Architectures Software Developer’s Manual
    Volume 3B: System Programming Guide
    http://www.intel.com/sdm/

 2. AMD Processor Programming Reference (PPR)
    https://bugzilla.kernel.org/show_bug.cgi?id=206537

perf                              06/14/2022                       PERF-LIST(1)