PERF-LIST(1)                      perf Manual                      PERF-LIST(1)

NAME
perf-list - List all symbolic event types

SYNOPSIS
perf list [--no-desc] [--long-desc]
          [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION
This command displays the symbolic event types which can be selected in
the various perf commands with the -e option.

OPTIONS
--no-desc
    Don’t print descriptions.

-v, --long-desc
    Print longer event descriptions.

--details
    Print how named events are resolved internally into perf events,
    and also any extra expressions computed by perf stat.

EVENT MODIFIERS
An event can optionally be qualified by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

    u - user-space counting
    k - kernel counting
    h - hypervisor counting
    I - non idle counting
    G - guest counting (in KVM guests)
    H - host counting (not in KVM guests)
    p - precise level
    P - use maximum detected precise level
    S - read sample value (PERF_SAMPLE_READ)
    D - pin the event to the PMU
    W - group is weak and will fall back to non-group if not schedulable,
        only supported in 'perf stat' for now.

The p modifier specifies how precise the instruction address should be.
It can be specified multiple times; the number of repetitions selects
the precise level:

    0 - SAMPLE_IP can have arbitrary skid
    1 - SAMPLE_IP must have constant skid
    2 - SAMPLE_IP requested to have 0 skid
    3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
        sample shadowing effects.

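The mapping from precise level to repeated p modifiers can be sketched as a small shell helper. This is only an illustration: with_precise is a hypothetical function name, not part of perf, and it does not check whether the hardware supports the requested level.

```shell
# with_precise EVENT LEVEL
# Hypothetical helper: append LEVEL repetitions of the p modifier to EVENT.
with_precise() {
    event=$1
    level=$2
    suffix=
    i=0
    while [ "$i" -lt "$level" ]; do
        suffix="${suffix}p"
        i=$((i + 1))
    done
    if [ -n "$suffix" ]; then
        printf '%s:%s\n' "$event" "$suffix"
    else
        # Level 0 means no p modifier at all.
        printf '%s\n' "$event"
    fi
}

with_precise cycles 2    # prints "cycles:pp"
```

The result could then be passed to a perf command, for example perf record -e "$(with_precise cycles 2)" ...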
On Intel systems, precise event sampling is implemented with PEBS,
which supports up to precise level 2, and precise level 3 in some
special cases.

On AMD systems it is implemented using IBS (up to precise level 2). The
precise modifier works with event types 0x76 (cpu-cycles, CPU clocks
not halted) and 0xC1 (micro-ops retired). Both events map to IBS
execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set accordingly (see AMD64 Architecture Programmer’s
Manual Volume 2: System Programming, 13.3 Instruction-Based Sampling).
Examples using IBS:

    perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
    perf record -a -e r076:p ...          # same as -e cpu-cycles:p
    perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

RAW HARDWARE EVENT DESCRIPTOR
Even when an event is not available in symbolic form within perf, it
can be encoded in a processor-specific way.

For instance, on x86 CPUs NNN (as in rNNN) represents the raw register
encoding with the layout of the IA32_PERFEVTSELx MSRs (see [Intel® 64
and IA-32 Architectures Software Developer’s Manual Volume 3B: System
Programming Guide] Figure 30-1 Layout of IA32_PERFEVTSELx MSRs) or
AMD’s PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume
2: System Programming], Page 344, Figure 13-7 Performance Event-Select
Register (PerfEvtSeln)).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host
only and OS/user mode flags must be set up using EVENT MODIFIERS.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

    Event  Umask  Event Mask
    Num.   Value  Mnemonic    Description                        Comment

    A8H    01H    LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                              delivered by loop stream detector  invert to count
                                                                 cycles

then the raw encoding 0x1A8 can be used:

    perf stat -e r1a8 -a sleep 1
    perf record -e r1a8 ...

Refer to the processor-specific documentation for these details. Some
of it is referenced in the SEE ALSO section below.

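The way 0x1A8 is formed from the umask and event number can be sketched in shell arithmetic. The bit positions used here (event in bits 0-7, umask in bits 8-15, edge at bit 18, inv at bit 23, cmask in bits 24-31) are taken to be the Intel IA32_PERFEVTSELx layout and should be checked against the vendor manual; raw_event is a hypothetical helper, not a perf command.

```shell
# raw_event EVENT UMASK [CMASK] [INV] [EDGE]
# Hypothetical helper: compose an rNNN descriptor from the register fields.
# Assumed layout: event 0-7, umask 8-15, edge 18, inv 23, cmask 24-31.
raw_event() {
    event=$1
    umask=$2
    cmask=${3:-0}
    inv=${4:-0}
    edge=${5:-0}
    printf 'r%x\n' $(( ($cmask << 24) | ($inv << 23) | ($edge << 18) | ($umask << 8) | $event ))
}

raw_event 0xa8 0x01        # prints "r1a8"
raw_event 0xa8 0x01 1 1    # prints "r18001a8" (cmask=1, inv=1)
```

The second call follows the "Use cmask=1 and invert to count cycles" comment from the table above.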
ARBITRARY PMUS
perf also supports an extended syntax for specifying raw parameters to
PMUs. Using this typically requires looking up the specific event in
the CPU vendor's documentation.

The available PMUs and their raw parameters can be listed with:

    ls /sys/devices/*/format

For example, the raw "LSD.UOPS" core PMU event above could be specified
as:

    perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ...

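To see which bits each raw parameter occupies, the format files can be printed together with their contents. The following is a minimal sketch; list_pmu_formats is a hypothetical helper, and its optional directory argument exists only so the function can be tried against a scratch tree instead of /sys/devices.

```shell
# list_pmu_formats [ROOT]
# Hypothetical helper: print "<pmu> <field> <bit range>" for every
# format file found under ROOT (default: /sys/devices).
list_pmu_formats() {
    root=${1:-/sys/devices}
    for f in "$root"/*/format/*; do
        [ -f "$f" ] || continue
        pmu=${f%/format/*}    # strip "/format/<field>"
        pmu=${pmu##*/}        # keep only the PMU directory name
        printf '%s %s %s\n' "$pmu" "${f##*/}" "$(cat "$f")"
    done
}
```

Each field name printed this way can be used as a key inside the pmu/key=value,.../ syntax shown above.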
PER SOCKET PMUS
Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted
globally with perf stat -a. They can be bound to one logical CPU, but
will measure all the CPUs in the same socket.

This example measures memory bandwidth every second on the first memory
controller on socket 0 of an Intel Xeon system:

    perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together.

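Building the event list for all imc PMUs by hand is tedious, so it can be generated from sysfs. This sketch assumes the memory controller PMUs appear as uncore_imc_* directories, as in the example above; imc_event_list is a hypothetical helper, and its optional directory argument is there only for trying it out against a scratch tree.

```shell
# imc_event_list [ROOT]
# Hypothetical helper: print a comma-separated list of the read and write
# CAS count events for every uncore_imc_* PMU under ROOT
# (default: /sys/devices).
imc_event_list() {
    root=${1:-/sys/devices}
    list=
    for d in "$root"/uncore_imc_*; do
        [ -d "$d" ] || continue
        pmu=${d##*/}
        # Append with a comma separator unless the list is still empty.
        list="${list:+$list,}$pmu/cas_count_read/,$pmu/cas_count_write/"
    done
    printf '%s\n' "$list"
}
```

The output could be passed on as perf stat -a -I 1000 -e "$(imc_event_list)" ...; the per-controller values still have to be added together afterwards.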
This example measures the combined core power every second:

    perf stat -I 1000 -e power/energy-cores/ -a

ACCESS RESTRICTIONS
For non-root users, generally only context-switched PMU events are
available. These are normally only the events in the cpu PMU, the
predefined events like cycles and instructions, and some software
events.

Other PMUs and global measurements are normally root only. Some event
qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid sysctl
to -1, which allows non-root users to use these events.

To access tracepoint events, perf needs read access to
/sys/kernel/debug/tracing, even when perf_event_paranoid is in a
relaxed setting.

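Whether the sysctl is already in the fully relaxed setting can be checked before running perf as a non-root user. This is a minimal sketch; paranoid_relaxed is a hypothetical helper that only tests for the -1 setting described above.

```shell
# paranoid_relaxed [VALUE]
# Hypothetical helper: succeed if VALUE is -1 or lower. Without an
# argument the current setting is read from procfs.
paranoid_relaxed() {
    level=${1:-$(cat /proc/sys/kernel/perf_event_paranoid)}
    [ "$level" -le -1 ]
}

# As root, the setting could be changed with:
#   sysctl -w kernel.perf_event_paranoid=-1
```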
TRACING
Some PMUs control advanced hardware tracing capabilities, such as Intel
PT, which allow low-overhead execution tracing. These are described in
a separate intel-pt.txt document.

PARAMETERIZED EVENTS
Some PMU events listed by perf list are displayed with a ? in them. For
example:

    hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when the event is specified, a value must be supplied
in place of the ?. For example:

    perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

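Substituting a value for the ? placeholder can be scripted. The following sketch assumes the value is a plain number such as 0x2, with no characters that are special to sed; fill_param is a hypothetical helper, not a perf feature.

```shell
# fill_param TEMPLATE VALUE
# Hypothetical helper: replace the first ? in TEMPLATE with VALUE.
# VALUE must not contain '/' or '&', which sed would interpret.
fill_param() {
    printf '%s\n' "$1" | sed "s/?/$2/"
}

fill_param 'hv_gpci/dtbp_ptitc,phys_processor_idx=?/' 0x2
# prints "hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/"
```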
EVENT GROUPS
perf supports time-based multiplexing of events when the number of
active events exceeds the number of hardware performance counters.
Multiplexing can cause measurement errors when the workload changes its
execution profile.

When metrics are computed using formulas from event counts, it is
useful to ensure some events are always measured together as a group to
minimize multiplexing errors. Event groups can be specified using { }.

    perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A
group cannot contain more events than available counters. For example,
Intel Core CPUs typically have four generic performance counters for
the core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can be scheduled on, and may not support multiple instances in a single
group. When too many events are specified in the group some of them
will not be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by
default. The NMI watchdog can be disabled as root with:

    echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

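A script that assembles groups can refuse to build one that is larger than the number of counters it expects to have. This is a rough sketch only: make_group is a hypothetical helper, the counter limit is supplied by the caller, and the check ignores fixed counters and per-event scheduling restrictions.

```shell
# make_group MAX EVENT...
# Hypothetical helper: print '{e1,e2,...}', or fail if more than MAX
# events are given. MAX is the caller's estimate of usable counters.
make_group() {
    max=$1
    shift
    if [ "$#" -gt "$max" ]; then
        echo "make_group: $# events exceed $max counters" >&2
        return 1
    fi
    list=
    for e in "$@"; do
        list="${list:+$list,}$e"
    done
    printf '{%s}\n' "$list"
}

make_group 4 instructions cycles    # prints "{instructions,cycles}"
```

The result would be quoted when passed on, as in perf stat -e "$(make_group 4 instructions cycles)" ...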
LEADER SAMPLING
perf also supports group leader sampling using the :S specifier.

    perf record -e '{cycles,instructions}:S' ...
    perf report --group

Normally all events in an event group sample, but with :S only the
first event (the leader) samples, and it only reads the values of the
other events in the group.

OPTIONS
Without options all known events will be listed.

To limit the list use:

 1. hw or hardware to list hardware events such as cache-misses, etc.

 2. sw or software to list software events such as context switches,
    etc.

 3. cache or hwcache to list hardware cache events such as
    L1-dcache-loads, etc.

 4. tracepoint to list all tracepoint events, alternatively use
    subsys_glob:event_glob to filter by tracepoint subsystems such as
    sched, block, etc.

 5. pmu to print the kernel supplied PMU events.

 6. sdt to list all Statically Defined Tracepoint events.

 7. metric to list metrics.

 8. metricgroup to list metricgroups with metrics.

 9. If none of the above is matched, it will apply the supplied glob to
    all events, printing the ones that match.

10. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for
the types specified.

Raw format is also supported:

 1. --raw-dump, shows the raw-dump of all the events.

 2. --raw-dump [hw|sw|cache|tracepoint|pmu|event_glob], shows the
    raw-dump of a certain kind of events.

SEE ALSO
perf-stat(1), perf-top(1), perf-record(1), Intel® 64 and IA-32
Architectures Software Developer’s Manual Volume 3B: System Programming
Guide[1], AMD64 Architecture Programmer’s Manual Volume 2: System
Programming[2]

NOTES
 1. Intel® 64 and IA-32 Architectures Software Developer’s Manual
    Volume 3B: System Programming Guide
    http://www.intel.com/sdm/

 2. AMD64 Architecture Programmer’s Manual Volume 2: System Programming
    http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf

perf                              06/18/2019                       PERF-LIST(1)