PERF-LIST(1)                      perf Manual                      PERF-LIST(1)

NAME
perf-list - List all symbolic event types

SYNOPSIS
perf list [--no-desc] [--long-desc]
          [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION
This command displays the symbolic event types which can be selected in
the various perf commands with the -e option.

OPTIONS
-d, --desc
    Print extra event descriptions. (default)

--no-desc
    Don’t print descriptions.

-v, --long-desc
    Print longer event descriptions.

--debug
    Enable debugging output.

--details
    Print how named events are resolved internally into perf events,
    and also any extra expressions computed by perf stat.

--deprecated
    Print deprecated events. By default the deprecated events are
    hidden.

--unit
    Print PMU events and metrics limited to the specific PMU name.
    (e.g. --unit cpu, --unit msr, --unit cpu_core, --unit cpu_atom)

-j, --json
    Output in JSON format.

EVENT MODIFIERS
Events can optionally have a modifier by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

    u - user-space counting
    k - kernel counting
    h - hypervisor counting
    I - non idle counting
    G - guest counting (in KVM guests)
    H - host counting (not in KVM guests)
    p - precise level
    P - use maximum detected precise level
    S - read sample value (PERF_SAMPLE_READ)
    D - pin the event to the PMU
    W - group is weak and will fall back to non-group if not schedulable
    e - group or event are exclusive and do not share the PMU

The p modifier can be used for specifying how precise the instruction
address should be. The p modifier can be specified multiple times; the
number of p's gives the precise level:

    0 - SAMPLE_IP can have arbitrary skid
    1 - SAMPLE_IP must have constant skid
    2 - SAMPLE_IP requested to have 0 skid
    3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
        sample shadowing effects.

On Intel systems, precise event sampling is implemented with PEBS, which
supports up to precise level 2, and precise level 3 in some special
cases.

On AMD systems it is implemented using IBS (up to precise level 2). The
precise modifier works with event types 0x76 (cpu-cycles, CPU clocks
not halted) and 0xC1 (micro-ops retired). Both events map to IBS
execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set respectively (see the Core Complex (CCX) → Processor
x86 Core → Instruction Based Sampling (IBS) section of the [AMD
Processor Programming Reference (PPR)] relevant to the family, model
and stepping of the processor being used).

Examples to use IBS:

    perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
    perf record -a -e r076:p ...          # same as -e cpu-cycles:p
    perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

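The modifier syntax above can be illustrated with a small shell sketch that assembles the string passed to -e. The helper names precise_suffix and with_modifiers are hypothetical and not part of perf; the sketch only builds strings and does not check that the hardware supports the requested precise level.

```shell
#!/bin/sh
# Hypothetical helpers (not part of perf) that build "<event>:<modifiers>".

# precise_suffix LEVEL: print LEVEL copies of "p" (LEVEL is 0..3).
precise_suffix() {
    level=$1
    out=""
    while [ "$level" -gt 0 ]; do
        out="${out}p"
        level=$((level - 1))
    done
    printf '%s' "$out"
}

# with_modifiers EVENT MODS [PRECISE_LEVEL]
# MODS is a string of modifier letters such as "u" or "k"; it may be empty.
with_modifiers() {
    event=$1
    mods=$2
    suffix="${mods}$(precise_suffix "${3:-0}")"
    if [ -n "$suffix" ]; then
        printf '%s:%s\n' "$event" "$suffix"
    else
        printf '%s\n' "$event"
    fi
}

with_modifiers cycles u          # prints cycles:u
with_modifiers cycles "" 2       # prints cycles:pp
with_modifiers instructions k 1  # prints instructions:kp
```

The result can then be used as, for example, perf record -e "$(with_modifiers cycles u 2)" ...
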
RAW HARDWARE EVENT DESCRIPTOR
Even when an event is not available in a symbolic form within perf
right now, it can be encoded in a processor-specific way.

For instance, on x86 CPUs, the N in a raw event descriptor rN is a
hexadecimal value that represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures
Software Developer’s Manual Volume 3B: System Programming Guide] Figure
30-1 Layout of IA32_PERFEVTSELx MSRs) or AMD’s PERF_CTL MSRs (see the
Core Complex (CCX) → Processor x86 Core → MSR Registers section of the
[AMD Processor Programming Reference (PPR)] relevant to the family,
model and stepping of the processor being used).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host
only and OS/user mode flags must be set up using EVENT MODIFIERS.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

    Event  Umask  Event Mask
    Num.   Value  Mnemonic   Description                        Comment

    A8H    01H    LSD.UOPS   Counts the number of micro-ops     Use cmask=1 and
                             delivered by loop stream detector  invert to count
                                                                cycles

raw encoding of 0x1A8 can be used:

    perf stat -e r1a8 -a sleep 1
    perf record -e r1a8 ...

It’s also possible to use pmu syntax:

    perf record -e r1a8 -a sleep 1
    perf record -e cpu/r1a8/ ...
    perf record -e cpu/r0x1a8/ ...

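As a sketch of how the 0x1A8 value above is assembled, the following shell function packs the fields into a raw descriptor. It assumes the IA32_PERFEVTSELx layout from the Intel SDM figure cited above (event select in bits 0-7, umask in bits 8-15, inv in bit 23, cmask in bits 24-31). The function name raw_x86 is hypothetical and not part of perf.

```shell
#!/bin/sh
# raw_x86 EVENT UMASK [CMASK] [INV]
# Hypothetical helper: pack the fields into an rN raw event descriptor,
# assuming event = bits 0-7, umask = bits 8-15, inv = bit 23,
# cmask = bits 24-31.
raw_x86() {
    event=$(( $1 & 0xff ))
    umask=$(( $2 & 0xff ))
    cmask=$(( ${3:-0} & 0xff ))
    inv=$(( ${4:-0} & 1 ))
    printf 'r%x\n' $(( (cmask << 24) | (inv << 23) | (umask << 8) | event ))
}

raw_x86 0xa8 0x01        # prints r1a8
raw_x86 0xa8 0x01 1 1    # prints r18001a8 (cmask=1 and invert)
```

The first call reproduces the r1a8 descriptor used in the examples; the second adds the cmask=1 and invert setting suggested in the table's Comment column.
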
Some processors, like those from AMD, support event codes and unit
masks larger than a byte. In such cases, the bits corresponding to the
event configuration parameters can be seen with:

    cat /sys/bus/event_source/devices/<pmu>/format/<config>

Example:

If the AMD docs for an EPYC 7713 processor describe an event as:

    Event  Umask  Event Mask
    Num.   Value  Mnemonic                        Description

    28FH   03H    op_cache_hit_miss.op_cache_hit  Counts Op Cache micro-tag
                                                  hit events.

raw encoding of 0x0328F cannot be used since the upper nibble of the
EventSelect bits has to be specified via bits 32-35, as can be seen
with:

    cat /sys/bus/event_source/devices/cpu/format/event

raw encoding of 0x20000038F should be used instead:

    perf stat -e r20000038f -a sleep 1
    perf record -e r20000038f ...

It’s also possible to use pmu syntax:

    perf record -e r20000038f -a sleep 1
    perf record -e cpu/r20000038f/ ...
    perf record -e cpu/r0x20000038f/ ...

You should refer to the processor-specific documentation for getting
these details. Some of them are referenced in the SEE ALSO section
below.

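The AMD encoding above can be reproduced with a short shell sketch. It assumes the split described in the text (low byte of the event in bits 0-7, upper nibble in bits 32-35) and additionally assumes the unit mask sits in bits 8-15, which matches the worked example. The function name raw_amd is hypothetical; the layout should be checked against the format files of the PMU in use.

```shell
#!/bin/sh
# raw_amd EVENT UMASK
# Hypothetical helper: build an rN descriptor for a 12-bit event code,
# assuming event[7:0] -> bits 0-7, umask -> bits 8-15,
# event[11:8] -> bits 32-35.
raw_amd() {
    event=$(( $1 ))
    umask=$(( $2 ))
    low=$(( event & 0xff ))
    high=$(( (event >> 8) & 0xf ))
    printf 'r%x\n' $(( (high << 32) | ((umask & 0xff) << 8) | low ))
}

raw_amd 0x28f 0x03    # prints r20000038f
```
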
ARBITRARY PMUS
perf also supports an extended syntax for specifying raw parameters to
PMUs. Using this typically requires looking up the specific event in
the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with:

    ls /sys/devices/*/format

For example, the raw event "LSD.UOPS" core pmu event above could be
specified as:

    perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

or using extended name syntax:

    perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

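To see which raw parameters each PMU accepts and which config bits they occupy, the format files can be walked with a small loop. This is a minimal sketch: the function name list_pmu_formats is hypothetical, and the default directory is the /sys/bus/event_source/devices path mentioned earlier in this page, which is assumed to be mounted.

```shell
#!/bin/sh
# list_pmu_formats [ROOT]
# Hypothetical helper: print "pmu/field: bits" for every format file
# found under ROOT (default: /sys/bus/event_source/devices).
list_pmu_formats() {
    root=${1:-/sys/bus/event_source/devices}
    for f in "$root"/*/format/*; do
        [ -f "$f" ] || continue      # skip when the glob matched nothing
        field=${f##*/}               # file name, e.g. "event"
        pmu_dir=${f%/format/*}       # path of the PMU directory
        printf '%s/%s: %s\n' "${pmu_dir##*/}" "$field" "$(cat "$f")"
    done
}

list_pmu_formats
```

Each output line names a parameter usable inside pmu/.../ together with the bit range it maps to.
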
PER SOCKET PMUS
Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted
globally with perf stat -a. They can be bound to one logical CPU, but
will measure all the CPUs in the same socket.

This example measures memory bandwidth every second on the first memory
controller on socket 0 of an Intel Xeon system:

    perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together. To simplify creation of multiple
events, prefix and glob matching is supported in the PMU name, and the
prefix uncore_ is also ignored when performing the match. So the
command above can be expanded to all memory controllers by using the
syntaxes:

    perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
    perf stat -C 0 -a -e *imc*/cas_count_read/,*imc*/cas_count_write/ -I 1000 ...

This example measures the combined core power every second:

    perf stat -I 1000 -e power/energy-cores/ -a

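When prefix or glob matching is not wanted, the per-controller event list can also be spelled out explicitly. The sketch below generates that list; the function name imc_events is hypothetical, and the controller count passed to it is an assumption that has to be taken from the perf list output of the machine in question.

```shell
#!/bin/sh
# imc_events COUNT
# Hypothetical helper: print a comma-separated list of read and write
# CAS count events for uncore_imc_0 .. uncore_imc_(COUNT-1).
imc_events() {
    i=0
    list=""
    while [ "$i" -lt "$1" ]; do
        list="${list:+$list,}uncore_imc_$i/cas_count_read/,uncore_imc_$i/cas_count_write/"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

imc_events 2
# Possible use: perf stat -C 0 -a -e "$(imc_events 2)" -I 1000 ...
```
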
ACCESS RESTRICTIONS
For non root users generally only context switched PMU events are
available. This is normally only the events in the cpu PMU, the
predefined events like cycles and instructions and some software
events.

Other PMUs and global measurements are normally root only. Some event
qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid sysctl
to -1, which allows non root to use these events.

For accessing trace point events perf needs to have read access to
/sys/kernel/tracing, even when perf_event_paranoid is in a relaxed
setting.

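Both conditions described above can be checked from a script before starting a measurement. This is a minimal sketch; the function names paranoid_level and unrestricted are hypothetical, and the /proc path is assumed to be the usual procfs location of the kernel.perf_event_paranoid sysctl.

```shell
#!/bin/sh
# paranoid_level [FILE]: print the perf_event_paranoid setting.
# FILE defaults to the assumed procfs location of the sysctl.
paranoid_level() {
    cat "${1:-/proc/sys/kernel/perf_event_paranoid}"
}

# unrestricted LEVEL: succeed when LEVEL is -1 or lower, the setting
# that the text above says opens these events to non root users.
unrestricted() {
    [ "$1" -le -1 ]
}

if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
    if unrestricted "$(paranoid_level)"; then
        echo "perf_event_paranoid allows non root access to all events"
    else
        echo "perf_event_paranoid restricts non root users"
    fi
fi

# Tracepoint events additionally need read access to tracefs.
if [ -r /sys/kernel/tracing ]; then
    echo "/sys/kernel/tracing is readable"
fi
```
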
TRACING
Some PMUs control advanced hardware tracing capabilities, such as Intel
PT, that allows low overhead execution tracing. These are described in
a separate intel-pt.txt document.

PARAMETERIZED EVENTS
Some pmu events listed by perf list will be displayed with ? in them.
For example:

    hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for ? must also be
supplied. For example:

    perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

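Filling in the placeholder can be scripted. The sketch below substitutes a value for the first ? in an event string; the function name fill_param is hypothetical, and it assumes the supplied value contains no characters that are special to sed (such as / or &).

```shell
#!/bin/sh
# fill_param EVENT VALUE
# Hypothetical helper: replace the first "?" in EVENT with VALUE.
# In a POSIX basic regular expression "?" matches a literal question mark.
fill_param() {
    printf '%s\n' "$1" | sed "s/?/$2/"
}

fill_param 'hv_gpci/dtbp_ptitc,phys_processor_idx=?/' 0x2
# prints hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/
```
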
EVENT QUALIFIERS:

It is also possible to add extra qualifiers to an event:

percore:

Sums up the event counts for all hardware threads in a core, e.g.:

    perf stat -e cpu/event=0,umask=0x3,percore=1/

EVENT GROUPS
Perf supports time based multiplexing of events, when the number of
events active exceeds the number of hardware performance counters.
Multiplexing can cause measurement errors when the workload changes its
execution profile.

When metrics are computed using formulas from event counts, it is
useful to ensure some events are always measured together as a group to
minimize multiplexing errors. Event groups can be specified using { }.

    perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A group
cannot contain more events than available counters. For example, Intel
Core CPUs typically have four generic performance counters for the
core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can schedule, and may not support multiple instances in a single group.
When too many events are specified in the group some of them will not
be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by
default. The NMI watchdog can be disabled as root with:

    echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

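The group syntax and the size limit described above can be combined in a small helper that refuses to build an oversized group. This is only a sketch: the function name make_group is hypothetical, and the counter limit passed in is an assumption that must come from the CPU documentation. It does not model fixed counters, per-event scheduling restrictions or a counter pinned by the NMI watchdog.

```shell
#!/bin/sh
# make_group MAX EVENT...
# Hypothetical helper: print the events as a '{e1,e2,...}' group, or
# fail when more than MAX events are given.
make_group() {
    max=$1
    shift
    if [ "$#" -gt "$max" ]; then
        echo "make_group: $# events exceed $max counters" >&2
        return 1
    fi
    joined=""
    for e in "$@"; do
        joined="${joined:+$joined,}$e"
    done
    printf '{%s}\n' "$joined"
}

make_group 4 instructions cycles    # prints {instructions,cycles}
# Possible use: perf stat -e "$(make_group 4 instructions cycles)" ...
```
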
LEADER SAMPLING
perf also supports group leader sampling using the :S specifier.

    perf record -e '{cycles,instructions}:S' ...
    perf report --group

Normally all events in an event group sample, but with :S only the
first event (the leader) samples, and it only reads the values of the
other events in the group.

However, in the case of AUX area events (e.g. Intel PT or CoreSight),
the AUX area event must be the leader, so then the second event samples,
not the first.

OPTIONS
Without options all known events will be listed.

To limit the list use:

 1. hw or hardware to list hardware events such as cache-misses, etc.

 2. sw or software to list software events such as context switches,
    etc.

 3. cache or hwcache to list hardware cache events such as
    L1-dcache-loads, etc.

 4. tracepoint to list all tracepoint events, alternatively use
    subsys_glob:event_glob to filter by tracepoint subsystems such as
    sched, block, etc.

 5. pmu to print the kernel supplied PMU events.

 6. sdt to list all Statically Defined Tracepoint events.

 7. metric to list metrics.

 8. metricgroup to list metricgroups with metrics.

 9. If none of the above is matched, it will apply the supplied glob to
    all events, printing the ones that match.

10. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for
the types specified.

A raw format is also supported:

 1. --raw-dump, shows the raw-dump of all the events.

 2. --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob], shows the
    raw-dump of a certain kind of events.

SEE ALSO
perf-stat(1), perf-top(1), perf-record(1), Intel® 64 and IA-32
Architectures Software Developer’s Manual Volume 3B: System Programming
Guide[1], AMD Processor Programming Reference (PPR)[2]

NOTES
 1. Intel® 64 and IA-32 Architectures Software Developer’s Manual
    Volume 3B: System Programming Guide
    http://www.intel.com/sdm/

 2. AMD Processor Programming Reference (PPR)
    https://bugzilla.kernel.org/show_bug.cgi?id=206537

perf                              11/28/2023                      PERF-LIST(1)