perf-list(1)

PERF-LIST(1)                      perf Manual                     PERF-LIST(1)

NAME

       perf-list - List all symbolic event types

SYNOPSIS

       perf list [--no-desc] [--long-desc]
                   [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION

       This command displays the symbolic event types which can be selected in
       the various perf commands with the -e option.

OPTIONS

       -d, --desc
           Print extra event descriptions. (default)

       --no-desc
           Don’t print descriptions.

       -v, --long-desc
           Print longer event descriptions.

       --debug
           Enable debugging output.

       --details
           Print how named events are resolved internally into perf events,
           and also any extra expressions computed by perf stat.

       --deprecated
           Print deprecated events. By default the deprecated events are
           hidden.

       --unit
           Print PMU events and metrics limited to the specific PMU name.
           (e.g. --unit cpu, --unit msr, --unit cpu_core, --unit cpu_atom)

       -j, --json
           Output in JSON format.

EVENT MODIFIERS

       An event can optionally be followed by a colon and one or more
       modifiers. Modifiers allow the user to restrict the events to be
       counted. The following modifiers exist:

           u - user-space counting
           k - kernel counting
           h - hypervisor counting
           I - non idle counting
           G - guest counting (in KVM guests)
           H - host counting (not in KVM guests)
           p - precise level
           P - use maximum detected precise level
           S - read sample value (PERF_SAMPLE_READ)
           D - pin the event to the PMU
           W - group is weak and will fall back to non-group if not schedulable
           e - group or event is exclusive and does not share the PMU

       The p modifier specifies how precise the instruction address should
       be. It can be specified multiple times; the number of p characters
       gives the precise level:

           0 - SAMPLE_IP can have arbitrary skid
           1 - SAMPLE_IP must have constant skid
           2 - SAMPLE_IP requested to have 0 skid
           3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
               sample shadowing effects.

       On Intel systems, precise event sampling is implemented with PEBS,
       which supports up to precise level 2, and precise level 3 in some
       special cases.

       On AMD systems it is implemented using IBS (up to precise level 2). The
       precise modifier works with event types 0x76 (cpu-cycles, CPU clocks
       not halted) and 0xC1 (micro-ops retired). Both events map to IBS
       execution sampling (IBS op) with the IBS Op Counter Control bit
       (IbsOpCntCtl) set respectively (see the Core Complex (CCX) → Processor
       x86 Core → Instruction Based Sampling (IBS) section of the [AMD
       Processor Programming Reference (PPR)] relevant to the family, model
       and stepping of the processor being used, and the AMD64 Architecture
       Programmer’s Manual Volume 2: System Programming, 13.3
       Instruction-Based Sampling). Examples of using IBS:

           perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
           perf record -a -e r076:p ...          # same as -e cpu-cycles:p
           perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

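       The modifier syntax described in this section can be assembled
       mechanically. The following is a minimal sketch, not part of perf:
       with_modifiers is a hypothetical helper that appends the given flag
       letters and then one p per requested precise level.

```shell
# Hypothetical helper (not part of perf): append modifiers to an event name.
# Usage: with_modifiers EVENT [FLAGS] [PRECISE_LEVEL]
with_modifiers() {
    event=$1
    flags=${2:-}
    level=${3:-0}
    suffix=$flags
    i=0
    # One "p" per precise level, e.g. level 2 becomes "pp".
    while [ "$i" -lt "$level" ]; do
        suffix="${suffix}p"
        i=$((i + 1))
    done
    if [ -n "$suffix" ]; then
        printf '%s:%s\n' "$event" "$suffix"
    else
        printf '%s\n' "$event"
    fi
}

with_modifiers cycles u 2        # prints cycles:upp
with_modifiers instructions k    # prints instructions:k
with_modifiers r0C1 "" 1         # prints r0C1:p
```

       The result can be passed to -e, for example
       perf record -e "$(with_modifiers cycles u 2)" ...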

RAW HARDWARE EVENT DESCRIPTOR

       Even when an event is not currently available in symbolic form within
       perf, it can be encoded in a processor-specific way.

       For instance, on x86 CPUs the N in a raw event rN is a hexadecimal
       value that represents the raw register encoding with the layout of the
       IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software
       Developer’s Manual Volume 3B: System Programming Guide] Figure 30-1
       Layout of IA32_PERFEVTSELx MSRs) or AMD’s PERF_CTL MSRs (see the Core
       Complex (CCX) → Processor x86 Core → MSR Registers section of the [AMD
       Processor Programming Reference (PPR)] relevant to the family, model
       and stepping of the processor being used).

       Note: Only the following bit fields can be set in x86 counter
       registers: event, umask, edge, inv, cmask. In particular, guest/host
       only and OS/user mode flags must be set up using EVENT MODIFIERS.

       Example:

       If the Intel docs for a QM720 Core i7 describe an event as:

           Event  Umask  Event Mask
           Num.   Value  Mnemonic    Description                        Comment

           A8H      01H  LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                                     delivered by loop stream detector  invert to count
                                                                        cycles

       then the raw encoding 0x1A8 can be used:

           perf stat -e r1a8 -a sleep 1
           perf record -e r1a8 ...

       It’s also possible to use pmu syntax:

           perf record -e r1a8 -a sleep 1
           perf record -e cpu/r1a8/ ...
           perf record -e cpu/r0x1a8/ ...

       Some processors, like those from AMD, support event codes and unit
       masks larger than a byte. In such cases, the bits corresponding to the
       event configuration parameters can be seen with:

           cat /sys/bus/event_source/devices/<pmu>/format/<config>

       Example:

       If the AMD docs for an EPYC 7713 processor describe an event as:

           Event  Umask  Event Mask
           Num.   Value  Mnemonic                        Description

           28FH     03H  op_cache_hit_miss.op_cache_hit  Counts Op Cache micro-tag
                                                         hit events.

       then the raw encoding 0x0328F cannot be used, since the upper nibble
       of the EventSelect bits has to be specified via bits 32-35, as can be
       seen with:

           cat /sys/bus/event_source/devices/cpu/format/event

       The raw encoding 0x20000038F should be used instead:

           perf stat -e r20000038f -a sleep 1
           perf record -e r20000038f ...

       It’s also possible to use pmu syntax:

           perf record -e r20000038f -a sleep 1
           perf record -e cpu/r20000038f/ ...
           perf record -e cpu/r0x20000038f/ ...

       Refer to the processor-specific documentation for these details. Some
       of it is referenced in the NOTES section below.

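       The two worked examples in this section can be reproduced with shell
       arithmetic. This is a sketch under stated assumptions, not part of
       perf: intel_raw and amd_raw are hypothetical helpers. The Intel
       helper assumes the IA32_PERFEVTSELx layout (event in bits 0-7, umask
       in bits 8-15, inv in bit 23, cmask in bits 24-31). The AMD helper
       assumes the event's upper nibble goes to bits 32-35 and the umask to
       bits 8-15; check the format files in sysfs for the processor in use.

```shell
# Hypothetical helpers (not part of perf) that build rN raw event strings.

# Assumed Intel layout: event bits 0-7, umask bits 8-15, inv bit 23,
# cmask bits 24-31.
# Usage: intel_raw EVENT UMASK [CMASK] [INV]
intel_raw() {
    cmask=${3:-0}
    inv=${4:-0}
    printf 'r%x\n' $(( ($1 & 0xff) | (($2 & 0xff) << 8) | (($inv & 1) << 23) | (($cmask & 0xff) << 24) ))
}

# Assumed AMD layout: event bits 0-7 stay in config bits 0-7, event bits
# 8-11 move to config bits 32-35, umask goes to config bits 8-15.
# Usage: amd_raw EVENT UMASK
amd_raw() {
    printf 'r%x\n' $(( ($1 & 0xff) | (($2 & 0xff) << 8) | ((($1 >> 8) & 0xf) << 32) ))
}

intel_raw 0xa8 0x01        # prints r1a8
intel_raw 0xa8 0x01 1 1    # prints r18001a8 (cmask=1 and inv set)
amd_raw 0x28f 0x03         # prints r20000038f
```

       The second call corresponds to the "Use cmask=1 and invert to count
       cycles" comment in the Intel table.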

ARBITRARY PMUS

       perf also supports an extended syntax for specifying raw parameters to
       PMUs. Using this typically requires looking up the specific event in
       the CPU vendor specific documentation.

       The available PMUs and their raw parameters can be listed with:

           ls /sys/devices/*/format

       For example, the raw "LSD.UOPS" core PMU event above could be
       specified as:

           perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

       or using extended name syntax:

           perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

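       To see which bits each raw parameter occupies, the format files can
       be read in a loop. A minimal sketch, not part of perf:
       list_pmu_formats is a hypothetical helper, and its default directory
       is the /sys/bus/event_source/devices path used earlier in this page.

```shell
# Hypothetical helper (not part of perf): print "PMU FIELD BITS" for every
# format file below an event_source devices directory.
# Usage: list_pmu_formats [DEVICES_DIR]
list_pmu_formats() {
    root=${1:-/sys/bus/event_source/devices}
    for f in "$root"/*/format/*; do
        [ -f "$f" ] || continue      # skip the literal glob if nothing matches
        pmu=${f%/format/*}           # strip the trailing /format/FIELD
        pmu=${pmu##*/}               # keep only the PMU directory name
        printf '%s %s %s\n' "$pmu" "${f##*/}" "$(cat "$f")"
    done
}

list_pmu_formats
```

       Each output line names a PMU, one of its parameters and the config
       bits that parameter maps to; the exact set depends on the machine.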

PER SOCKET PMUS

       Some PMUs are not associated with a core, but with a whole CPU socket.
       Events on these PMUs generally cannot be sampled, but only counted
       globally with perf stat -a. They can be bound to one logical CPU, but
       will measure all the CPUs in the same socket.

       This example measures memory bandwidth every second on the first memory
       controller on socket 0 of an Intel Xeon system:

           perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

       Each memory controller has its own PMU. Measuring the complete system
       bandwidth would require specifying all imc PMUs (see perf list output),
       and adding the values together. To simplify creation of multiple
       events, prefix and glob matching is supported in the PMU name, and the
       prefix uncore_ is also ignored when performing the match. So the
       command above can be expanded to all memory controllers by using either
       of these syntaxes:

           perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
           perf stat -C 0 -a -e *imc*/cas_count_read/,*imc*/cas_count_write/ -I 1000 ...

       This example measures the combined core power every second:

           perf stat -I 1000 -e power/energy-cores/ -a

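       For comparison with the prefix and glob forms, the explicit event
       list for several memory controllers can be generated. A minimal
       sketch, not part of perf: imc_events is a hypothetical helper, and
       the controller indexes passed to it are assumptions that must match
       the uncore_imc_N PMUs shown by perf list.

```shell
# Hypothetical helper (not part of perf): build the explicit event list for
# the given memory controller indexes.
# Usage: imc_events INDEX...
imc_events() {
    list=
    for n in "$@"; do
        # ${list:+$list,} adds a separating comma only after the first entry.
        list="${list:+$list,}uncore_imc_$n/cas_count_read/,uncore_imc_$n/cas_count_write/"
    done
    printf '%s\n' "$list"
}

imc_events 0 1
# prints uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/,uncore_imc_1/cas_count_read/,uncore_imc_1/cas_count_write/
```

       The output can be passed to perf stat with -e, for example
       perf stat -C 0 -a -e "$(imc_events 0 1)" -I 1000 ...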

ACCESS RESTRICTIONS

       For non-root users, generally only context-switched PMU events are
       available. This is normally only the events in the cpu PMU, the
       predefined events like cycles and instructions and some software
       events.

       Other PMUs and global measurements are normally root only. Some event
       qualifiers, such as "any", are also root only.

       This can be overridden by setting the kernel.perf_event_paranoid sysctl
       to -1, which allows non-root users to use these events.

       For accessing tracepoint events perf needs to have read access to
       /sys/kernel/tracing, even when perf_event_paranoid is in a relaxed
       setting.

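       Whether the override described above is in effect can be checked
       before running a measurement. A minimal sketch, not part of perf:
       perf_unrestricted is a hypothetical helper that only tests for the
       value -1 (or lower) mentioned in this section and does not interpret
       other levels.

```shell
# Hypothetical helper (not part of perf): succeed if the paranoid setting is
# -1 or lower, the value described above as opening these events to
# non-root users.
# Usage: perf_unrestricted [SYSCTL_FILE]
perf_unrestricted() {
    file=${1:-/proc/sys/kernel/perf_event_paranoid}
    level=$(cat "$file" 2>/dev/null) || return 2   # file missing or unreadable
    [ "$level" -le -1 ]
}

if perf_unrestricted; then
    echo "perf_event_paranoid allows non-root use of these events"
else
    echo "restricted; as root: sysctl kernel.perf_event_paranoid=-1"
fi
```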

TRACING

       Some PMUs control advanced hardware tracing capabilities, such as Intel
       PT, that allow low overhead execution tracing. These are described in
       a separate intel-pt.txt document.

PARAMETERIZED EVENTS

       Some PMU events listed by perf list will be displayed with ? in them.
       For example:

           hv_gpci/dtbp_ptitc,phys_processor_idx=?/

       This means that when provided as an event, a value for ? must also be
       supplied. For example:

           perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

       EVENT QUALIFIERS:

       It is also possible to add extra qualifiers to an event:

       percore:

       Sums up the event counts for all hardware threads in a core, e.g.:

           perf stat -e cpu/event=0,umask=0x3,percore=1/

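       Filling in the ? placeholder of a parameterized event, as described
       at the start of this section, is a plain text substitution. A
       minimal sketch, not part of perf: fill_param is a hypothetical
       helper that replaces the first ? only, and it assumes the value
       contains no / or & characters.

```shell
# Hypothetical helper (not part of perf): replace the first ? placeholder in
# an event string with the supplied value.
# Usage: fill_param EVENT_TEMPLATE VALUE
fill_param() {
    # In a basic regular expression "?" is an ordinary character.
    printf '%s\n' "$1" | sed "s/?/$2/"
}

fill_param 'hv_gpci/dtbp_ptitc,phys_processor_idx=?/' 0x2
# prints hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/
```

       For an event with several placeholders the helper can be applied
       repeatedly, once per value.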

EVENT GROUPS

       Perf supports time based multiplexing of events, when the number of
       events active exceeds the number of hardware performance counters.
       Multiplexing can cause measurement errors when the workload changes its
       execution profile.

       When metrics are computed using formulas from event counts, it is
       useful to ensure some events are always measured together as a group to
       minimize multiplexing errors. Event groups can be specified using { }.

           perf stat -e '{instructions,cycles}' ...

       The number of available performance counters depends on the CPU. A
       group cannot contain more events than available counters. For example,
       Intel Core CPUs typically have four generic performance counters for
       the core, plus three fixed counters for instructions, cycles and
       ref-cycles. Some special events have restrictions on which counter they
       can be scheduled on, and may not support multiple instances in a single
       group. When too many events are specified in the group, some of them
       will not be measured.

       Globally pinned events can limit the number of counters available for
       other groups. On x86 systems, the NMI watchdog pins a counter by
       default. The NMI watchdog can be disabled as root with:

           echo 0 > /proc/sys/kernel/nmi_watchdog

       Events from multiple different PMUs cannot be mixed in a group, with
       some exceptions for software events.

LEADER SAMPLING

       perf also supports group leader sampling using the :S specifier.

           perf record -e '{cycles,instructions}:S' ...
           perf report --group

       Normally all events in an event group sample, but with :S only the
       first event (the leader) samples, and it only reads the values of the
       other events in the group.

       However, in the case of AUX area events (e.g. Intel PT or CoreSight),
       the AUX area event must be the leader, so then the second event
       samples, not the first.

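       The { } group syntax, with or without the :S specifier, can be built
       from a list of event names. A minimal sketch, not part of perf:
       event_group is a hypothetical helper; the first event passed becomes
       the group leader.

```shell
# Hypothetical helper (not part of perf): join event names into a group.
# Usage: event_group MODIFIERS EVENT...   (pass "" for no group modifier)
event_group() {
    mods=$1
    shift
    list=
    for ev in "$@"; do
        list="${list:+$list,}$ev"     # comma-separate, no leading comma
    done
    # ${mods:+:$mods} appends ":MODIFIERS" only when MODIFIERS is non-empty.
    printf '{%s}%s\n' "$list" "${mods:+:$mods}"
}

event_group "" instructions cycles    # prints {instructions,cycles}
event_group S cycles instructions     # prints {cycles,instructions}:S
```

       Quote the result when passing it on, for example
       perf record -e "$(event_group S cycles instructions)" ...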

OPTIONS

       Without options all known events will be listed.

       To limit the list use:

        1. hw or hardware to list hardware events such as cache-misses, etc.

        2. sw or software to list software events such as context switches,
           etc.

        3. cache or hwcache to list hardware cache events such as
           L1-dcache-loads, etc.

        4. tracepoint to list all tracepoint events, alternatively use
           subsys_glob:event_glob to filter by tracepoint subsystems such as
           sched, block, etc.

        5. pmu to print the kernel supplied PMU events.

        6. sdt to list all Statically Defined Tracepoint events.

        7. metric to list metrics.

        8. metricgroup to list metricgroups with metrics.

        9. If none of the above is matched, it will apply the supplied glob to
           all events, printing the ones that match.

       10. As a last resort, it will do a substring search in all event names.

       One or more types can be used at the same time, listing the events for
       the types specified.

       Raw format is also supported:

        1. --raw-dump, shows the raw-dump of all the events.

        2. --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob], shows the
           raw-dump of a certain kind of events.

NOTES

        1. Intel® 64 and IA-32 Architectures Software Developer’s Manual
           Volume 3B: System Programming Guide
           http://www.intel.com/sdm/

        2. AMD Processor Programming Reference (PPR)
           https://bugzilla.kernel.org/show_bug.cgi?id=206537

perf                              11/28/2023                      PERF-LIST(1)