PERF-LIST(1)                      perf Manual                     PERF-LIST(1)

NAME

       perf-list - List all symbolic event types

SYNOPSIS

       perf list [--no-desc] [--long-desc]
                   [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION

       This command displays the symbolic event types which can be selected in
       the various perf commands with the -e option.

OPTIONS

       --no-desc
           Don’t print descriptions.

       -v, --long-desc
           Print longer event descriptions.

       --details
           Print how named events are resolved internally into perf events,
           and also any extra expressions computed by perf stat.

EVENT MODIFIERS

       An event can optionally be followed by a colon and one or more
       modifiers. Modifiers restrict which events are counted. The following
       modifiers exist:

           u - user-space counting
           k - kernel counting
           h - hypervisor counting
           I - non idle counting
           G - guest counting (in KVM guests)
           H - host counting (not in KVM guests)
           p - precise level
           P - use maximum detected precise level
           S - read sample value (PERF_SAMPLE_READ)
           D - pin the event to the PMU
           W - group is weak and will fall back to non-group if not schedulable,
               only supported in 'perf stat' for now.

       The p modifier specifies how precise the instruction address should
       be. It can be repeated, and the number of p characters selects the
       precise level:

           0 - SAMPLE_IP can have arbitrary skid
           1 - SAMPLE_IP must have constant skid
           2 - SAMPLE_IP requested to have 0 skid
           3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
               sample shadowing effects.

       On Intel systems precise event sampling is implemented with PEBS, which
       supports up to precise level 2, and precise level 3 in some special
       cases.

       On AMD systems it is implemented using IBS (up to precise level 2). The
       precise modifier works with event types 0x76 (cpu-cycles, CPU clocks
       not halted) and 0xC1 (micro-ops retired). Both events map to IBS
       execution sampling (IBS op) with the IBS Op Counter Control bit
       (IbsOpCntCtl) set respectively (see AMD64 Architecture Programmer’s
       Manual Volume 2: System Programming, 13.3 Instruction-Based Sampling).
       Examples using IBS:

           perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
           perf record -a -e r076:p ...          # same as -e cpu-cycles:p
           perf record -a -e r0C1:p ...          # use ibs op counting micro-ops
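
       The relationship between the precise level and the number of p
       characters can be sketched with a small shell helper. The function
       name with_precise is hypothetical and not part of perf; it only
       builds the event string and does not run perf.

```shell
# Hypothetical helper (not part of perf): append LEVEL 'p' modifiers
# to an event name, e.g. level 2 -> "event:pp".
with_precise() {
    event=$1
    level=$2
    mods=""
    i=0
    while [ "$i" -lt "$level" ]; do
        mods="${mods}p"
        i=$((i + 1))
    done
    # Level 0 means no precise modifier at all.
    if [ -n "$mods" ]; then
        printf '%s:%s\n' "$event" "$mods"
    else
        printf '%s\n' "$event"
    fi
}

with_precise cycles 2          # prints cycles:pp
```

       The result could then be passed to a perf command, for example
       perf record -e "$(with_precise cycles 2)" ..., provided the hardware
       supports that precise level.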

RAW HARDWARE EVENT DESCRIPTOR

       Even when an event is not currently available in symbolic form within
       perf, it can be encoded in a processor-specific way.

       For instance, on x86 CPUs an event can be given as rNNN, where NNN is
       the hexadecimal raw register encoding with the layout of the
       IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software
       Developer’s Manual Volume 3B: System Programming Guide] Figure 30-1
       Layout of IA32_PERFEVTSELx MSRs) or AMD’s PerfEvtSeln (see [AMD64
       Architecture Programmer’s Manual Volume 2: System Programming], Page
       344, Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).

       Note: Only the following bit fields can be set in x86 counter
       registers: event, umask, edge, inv, cmask. In particular, guest/host
       only and OS/user mode flags must be set up using EVENT MODIFIERS.

       Example:

       If the Intel docs for a QM720 Core i7 describe an event as:

           Event  Umask  Event Mask
           Num.   Value  Mnemonic    Description                        Comment

           A8H      01H  LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                                     delivered by loop stream detector  invert to count
                                                                        cycles

       the raw encoding 0x1A8 can be used:

           perf stat -e r1a8 -a sleep 1
           perf record -e r1a8 ...

       Refer to the processor-specific documentation for these details. Some
       of it is referenced in the SEE ALSO section below.
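
       The step from the documented event number and umask to the raw code
       can be sketched as follows. This assumes the x86 layout in which the
       event select occupies bits 0-7 and the unit mask bits 8-15; the
       helper name raw_event is hypothetical, and the edge, inv and cmask
       fields are not handled.

```shell
# Hypothetical helper (not part of perf): combine an event number and a
# unit mask into an rNNN raw event descriptor.
# Assumes event select = bits 0-7, umask = bits 8-15.
raw_event() {
    event=$1
    umask=$2
    printf 'r%x\n' "$(( (umask << 8) | event ))"
}

raw_event 0xa8 0x01    # prints r1a8
```

       The printed string matches the r1a8 descriptor used in the example
       above.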

ARBITRARY PMUS

       perf also supports an extended syntax for specifying raw parameters to
       PMUs. Using this typically requires looking up the specific event in
       the CPU vendor specific documentation.

       The available PMUs and their raw parameters can be listed with

           ls /sys/devices/*/format

       For example, the raw "LSD.UOPS" core PMU event above could be
       specified as

           perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

       or, using the extended name syntax,

           perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...
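
       The pmu/term,term,.../ form can be assembled mechanically from a PMU
       name and a list of key=value terms. The sketch below uses a
       hypothetical helper named pmu_event; it does not check the terms
       against the PMU's format directory.

```shell
# Hypothetical helper (not part of perf): join key=value terms into the
# extended "pmu/term,term,.../" event syntax.
pmu_event() {
    pmu=$1
    shift
    terms=""
    for t in "$@"; do
        # Add a comma separator only between terms.
        terms="${terms:+$terms,}$t"
    done
    printf '%s/%s/\n' "$pmu" "$terms"
}

pmu_event cpu event=0xa8 umask=0x1 cmask=0x1
# prints cpu/event=0xa8,umask=0x1,cmask=0x1/
```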

PER SOCKET PMUS

       Some PMUs are not associated with a core, but with a whole CPU socket.
       Events on these PMUs generally cannot be sampled, but only counted
       globally with perf stat -a. They can be bound to one logical CPU, but
       will measure all the CPUs in the same socket.

       This example measures memory bandwidth every second on the first memory
       controller on socket 0 of an Intel Xeon system:

           perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

       Each memory controller has its own PMU. Measuring the complete system
       bandwidth would require specifying all imc PMUs (see perf list output),
       and adding the values together. To simplify creation of multiple
       events, prefix and glob matching is supported in the PMU name, and the
       prefix uncore_ is also ignored when performing the match. So the
       command above can be expanded to all memory controllers by using either
       of these syntaxes:

           perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
           perf stat -C 0 -a -e '*imc*/cas_count_read/,*imc*/cas_count_write/' -I 1000 ...

       This example measures the combined core power every second:

           perf stat -I 1000 -e power/energy-cores/  -a
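
       Without prefix or glob matching, every imc PMU has to be named
       explicitly. The sketch below builds such a list, assuming the
       controllers are numbered contiguously from 0 as uncore_imc_N (the
       actual names should be checked in the perf list output). The helper
       name imc_events is hypothetical.

```shell
# Hypothetical helper (not part of perf): build an explicit event list
# covering COUNT memory controllers.
# Assumes PMUs are named uncore_imc_0 .. uncore_imc_(COUNT-1).
imc_events() {
    count=$1
    event=$2
    list=""
    i=0
    while [ "$i" -lt "$count" ]; do
        list="${list:+$list,}uncore_imc_$i/$event/"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

imc_events 2 cas_count_read
# prints uncore_imc_0/cas_count_read/,uncore_imc_1/cas_count_read/
```

       The prefix form imc/cas_count_read/ described above avoids the need
       for such a list.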

ACCESS RESTRICTIONS

       For non-root users generally only context-switched PMU events are
       available. These are normally only the events in the cpu PMU, the
       predefined events like cycles and instructions, and some software
       events.

       Other PMUs and global measurements are normally root only. Some event
       qualifiers, such as "any", are also root only.

       This can be overridden by setting the kernel.perf_event_paranoid sysctl
       to -1, which allows non-root users to use these events.

       For accessing tracepoint events perf needs to have read access to
       /sys/kernel/debug/tracing, even when perf_event_paranoid is in a
       relaxed setting.
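
       A script can check the current setting before attempting a
       system-wide measurement. The sketch below assumes the sysctl is
       exposed at /proc/sys/kernel/perf_event_paranoid and treats any value
       of -1 or lower as unrestricted (the text above only states -1). The
       function name paranoid_unrestricted is hypothetical.

```shell
# Hypothetical helper: succeed if the given perf_event_paranoid value
# lifts the restrictions for non-root users.
# Assumption: values below -1 behave like -1.
paranoid_unrestricted() {
    [ "$1" -le -1 ]
}

# Assumed location of the sysctl; skipped silently if not readable.
f=/proc/sys/kernel/perf_event_paranoid
if [ -r "$f" ]; then
    if paranoid_unrestricted "$(cat "$f")"; then
        echo "non-root users may use all events"
    else
        echo "restricted; root can run: sysctl kernel.perf_event_paranoid=-1"
    fi
fi
```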

TRACING

       Some PMUs control advanced hardware tracing capabilities, such as Intel
       PT, which allows low-overhead execution tracing. These are described in
       a separate intel-pt.txt document.

PARAMETERIZED EVENTS

       Some PMU events listed by perf list will be displayed with ? in them.
       For example:

           hv_gpci/dtbp_ptitc,phys_processor_idx=?/

       This means that when provided as an event, a value for ? must also be
       supplied. For example:

           perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
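
       Filling in the placeholder can be scripted. The sketch below uses a
       hypothetical helper named fill_param that substitutes the first ?
       in an event string; it assumes the value contains no / characters,
       since sed is used with / as the delimiter.

```shell
# Hypothetical helper (not part of perf): replace the first '?' in a
# parameterized event string with a concrete value.
# Assumes VALUE contains no '/' or '&' characters.
fill_param() {
    printf '%s\n' "$1" | sed "s/?/$2/"
}

fill_param 'hv_gpci/dtbp_ptitc,phys_processor_idx=?/' 0x2
# prints hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/
```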

EVENT GROUPS

       Perf supports time-based multiplexing of events when the number of
       active events exceeds the number of hardware performance counters.
       Multiplexing can cause measurement errors when the workload changes its
       execution profile.

       When metrics are computed using formulas from event counts, it is
       useful to ensure some events are always measured together as a group to
       minimize multiplexing errors. Event groups can be specified using { }.

           perf stat -e '{instructions,cycles}' ...

       The number of available performance counters depends on the CPU. A
       group cannot contain more events than available counters. For example,
       Intel Core CPUs typically have four generic performance counters for
       the core, plus three fixed counters for instructions, cycles and
       ref-cycles. Some special events have restrictions on which counter they
       can be scheduled on, and may not support multiple instances in a single
       group. When too many events are specified in the group, some of them
       will not be measured.

       Globally pinned events can limit the number of counters available for
       other groups. On x86 systems, the NMI watchdog pins a counter by
       default. The NMI watchdog can be disabled as root with

           echo 0 > /proc/sys/kernel/nmi_watchdog

       Events from multiple different PMUs cannot be mixed in a group, with
       some exceptions for software events.
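
       The counter limit can be checked before a group is passed to perf.
       The sketch below uses a hypothetical helper named make_group that
       takes an assumed counter budget and refuses to build a group larger
       than it. It is a simplification: it ignores fixed counters,
       per-event scheduling restrictions and counters pinned by the NMI
       watchdog.

```shell
# Hypothetical helper (not part of perf): build a '{a,b,...}' group
# string, failing if it has more events than the assumed counter budget.
make_group() {
    max=$1
    shift
    if [ "$#" -gt "$max" ]; then
        echo "group has $# events but only $max counters are assumed" >&2
        return 1
    fi
    list=""
    for e in "$@"; do
        list="${list:+$list,}$e"
    done
    printf '{%s}\n' "$list"
}

make_group 4 instructions cycles    # prints {instructions,cycles}
```

       The output needs to be quoted when passed on, for example
       perf stat -e "$(make_group 4 instructions cycles)" ..., so that the
       shell does not interpret the braces.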

LEADER SAMPLING

       perf also supports group leader sampling using the :S specifier.

           perf record -e '{cycles,instructions}:S' ...
           perf report --group

       Normally all events in an event group sample, but with :S only the first
       event (the leader) samples, and it only reads the values of the other
       events in the group.

OPTIONS

       Without options all known events will be listed.

       To limit the list use:

        1. hw or hardware to list hardware events such as cache-misses, etc.

        2. sw or software to list software events such as context switches,
           etc.

        3. cache or hwcache to list hardware cache events such as
           L1-dcache-loads, etc.

        4. tracepoint to list all tracepoint events, alternatively use
           subsys_glob:event_glob to filter by tracepoint subsystems such as
           sched, block, etc.

        5. pmu to print the kernel supplied PMU events.

        6. sdt to list all Statically Defined Tracepoint events.

        7. metric to list metrics.

        8. metricgroup to list metricgroups with metrics.

        9. If none of the above is matched, it will apply the supplied glob to
           all events, printing the ones that match.

       10. As a last resort, it will do a substring search in all event names.

       One or more types can be used at the same time, listing the events for
       the types specified.

       Raw format support:

        1. --raw-dump, shows the raw-dump of all the events.

        2. --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob], shows the
           raw-dump of a certain kind of events.

SEE ALSO

       perf-stat(1), perf-top(1), perf-record(1), Intel® 64 and IA-32
       Architectures Software Developer’s Manual Volume 3B: System Programming
       Guide[1], AMD64 Architecture Programmer’s Manual Volume 2: System
       Programming[2]

NOTES

        1. Intel® 64 and IA-32 Architectures Software Developer’s Manual
           Volume 3B: System Programming Guide
           http://www.intel.com/sdm/

        2. AMD64 Architecture Programmer’s Manual Volume 2: System Programming
           http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf

perf                              09/24/2019                      PERF-LIST(1)