LIBPFM(3)                  Linux Programmer's Manual                 LIBPFM(3)



NAME
libpfm_nehalem - support for Intel Nehalem processor family

SYNOPSIS
#include <perfmon/pfmlib.h>
#include <perfmon/pfmlib_intel_nhm.h>


DESCRIPTION
The libpfm library provides full support for the Intel Nehalem processor
family, such as Intel Core i7. The interface is defined in
pfmlib_intel_nhm.h. It consists of a set of functions and structures
describing the PMU features specific to Intel Nehalem processors. The
Intel Nehalem processor is a quad-core, dual-thread processor. It
includes two types of PMU: core and uncore. The latter measures events
at the socket level and is therefore disconnected from any of the four
cores. The core PMU implements Intel architectural perfmon version 3
with four generic counters and three fixed counters. The uncore has
eight generic counters and one fixed counter. Each Intel Nehalem core
also implements a 16-deep branch trace buffer, called Last Branch Record
(LBR), which can be used in combination with the core PMU. Intel
Nehalem implements a newer version of the Precise Event-Based Sampling
(PEBS) mechanism, which can capture where cache misses occur.


When Intel Nehalem processor-specific features are needed to support a
measurement, their descriptions must be passed as model-specific input
arguments to the pfm_dispatch_events() function. These input arguments
are described in the pfmlib_nhm_input_param_t structure. No output
parameters are currently defined. The input parameters are defined as
follows:

typedef struct {
    unsigned long cnt_mask;
    unsigned int  flags;
} pfmlib_nhm_counter_t;

typedef struct {
    unsigned int lbr_used;
    unsigned int lbr_plm;
    unsigned int lbr_filter;
} pfmlib_nhm_lbr_t;

typedef struct {
    unsigned int pebs_used;
    unsigned int ld_lat_thres;
} pfmlib_nhm_pebs_t;

typedef struct {
    pfmlib_nhm_counter_t pfp_nhm_counters[PMU_NHM_NUM_COUNTERS];
    pfmlib_nhm_pebs_t    pfp_nhm_pebs;
    pfmlib_nhm_lbr_t     pfm_nhm_lbr;
    uint64_t             reserved[4];
} pfmlib_nhm_input_param_t;
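
A minimal sketch of how the model-specific input structure might be
prepared before any feature is selected. The typedefs are local mirrors
of the definitions listed above so that the sketch builds without
libpfm; the value given to PMU_NHM_NUM_COUNTERS is a placeholder and
nhm_param_init() is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: placeholder value; the real constant comes from pfmlib_intel_nhm.h. */
#define PMU_NHM_NUM_COUNTERS 32

/* Local mirrors of the structures listed above. */
typedef struct { unsigned long cnt_mask; unsigned int flags; } pfmlib_nhm_counter_t;
typedef struct { unsigned int lbr_used; unsigned int lbr_plm; unsigned int lbr_filter; } pfmlib_nhm_lbr_t;
typedef struct { unsigned int pebs_used; unsigned int ld_lat_thres; } pfmlib_nhm_pebs_t;

typedef struct {
    pfmlib_nhm_counter_t pfp_nhm_counters[PMU_NHM_NUM_COUNTERS];
    pfmlib_nhm_pebs_t    pfp_nhm_pebs;
    pfmlib_nhm_lbr_t     pfm_nhm_lbr;
    uint64_t             reserved[4];
} pfmlib_nhm_input_param_t;

/* Hypothetical helper: clear every field, including the reserved words,
 * so that no threshold, flag, PEBS or LBR setting is selected. */
static void nhm_param_init(pfmlib_nhm_input_param_t *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
}
```

Starting from an all-zero structure means each feature described below
is enabled only by the fields that are set explicitly afterwards.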


The Intel Nehalem processor provides a few additional per-event features
for counters: thresholding, inversion, edge detection, monitoring of
both threads, and queue occupancy reset. They can be set for each event
using the pfp_nhm_counters data structure. The flags field can be
initialized with the following values, depending on the event:

PFMLIB_NHM_SEL_INV
    Invert the result of the cnt_mask comparison when set. This flag
    is supported for core and uncore PMU events.

PFMLIB_NHM_SEL_EDGE
    Enable edge detection of events. This flag is supported for core
    and uncore PMU events.

PFMLIB_NHM_SEL_ANYTHR
    Enable measuring the event in either of the two processor threads,
    assuming hyper-threading is enabled. By default, only the current
    thread is measured. This flag is restricted to core PMU events.

PFMLIB_NHM_SEL_OCC_RST
    When set, the queue occupancy counter associated with the event
    is cleared. This flag is only available to uncore PMU events.
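
The core/uncore restrictions on these flags can be sketched as a small
check. This is an illustration under stated assumptions, not the
library's own validation: the flag bit values below are placeholders
(the real ones come from pfmlib_intel_nhm.h) and nhm_counter_flags_ok()
is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: illustrative bit values; use the definitions from pfmlib_intel_nhm.h. */
#define PFMLIB_NHM_SEL_INV     0x1
#define PFMLIB_NHM_SEL_EDGE    0x2
#define PFMLIB_NHM_SEL_ANYTHR  0x4
#define PFMLIB_NHM_SEL_OCC_RST 0x8

/* Local mirror of the per-counter structure described above. */
typedef struct { unsigned long cnt_mask; unsigned int flags; } pfmlib_nhm_counter_t;

/* Hypothetical helper: returns 1 if the flags are allowed for the PMU type. */
static int nhm_counter_flags_ok(const pfmlib_nhm_counter_t *c, int is_uncore)
{
    const unsigned int known = PFMLIB_NHM_SEL_INV | PFMLIB_NHM_SEL_EDGE |
                               PFMLIB_NHM_SEL_ANYTHR | PFMLIB_NHM_SEL_OCC_RST;

    assert(c != NULL);
    if (c->flags & ~known)
        return 0;                       /* unknown bit */
    if (is_uncore && (c->flags & PFMLIB_NHM_SEL_ANYTHR))
        return 0;                       /* ANYTHR is core only */
    if (!is_uncore && (c->flags & PFMLIB_NHM_SEL_OCC_RST))
        return 0;                       /* OCC_RST is uncore only */
    return 1;
}
```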

The cnt_mask field is used to set the event threshold. The counter is
incremented for each cycle in which the number of occurrences of the
event is greater than or equal to the value of the field. Thus, the
event is modified to actually measure the number of qualifying cycles.
When the field is zero, all occurrences are counted (this is the
default). This field is supported for core and uncore PMU events.
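
The effect of the threshold can be illustrated with a small software
model. This is only a sketch of the counting rule described above, not
hardware or library code: nhm_model_count() is a hypothetical function,
the PFMLIB_NHM_SEL_INV value is a placeholder, inversion is applied
only when cnt_mask is non-zero, and edge detection is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: illustrative bit value; use the definition from pfmlib_intel_nhm.h. */
#define PFMLIB_NHM_SEL_INV 0x1

/* Local mirror of the per-counter structure described above. */
typedef struct { unsigned long cnt_mask; unsigned int flags; } pfmlib_nhm_counter_t;

/* Hypothetical model: occ[i] is the number of event occurrences in cycle i.
 * Returns the value the counter would hold after ncycles cycles. */
static unsigned long nhm_model_count(const pfmlib_nhm_counter_t *c,
                                     const unsigned int *occ, size_t ncycles)
{
    unsigned long total = 0;
    size_t i;

    assert(c != NULL && occ != NULL);
    for (i = 0; i < ncycles; i++) {
        if (c->cnt_mask == 0) {
            total += occ[i];                    /* no threshold: count occurrences */
        } else {
            int hit = occ[i] >= c->cnt_mask;    /* threshold comparison */
            if (c->flags & PFMLIB_NHM_SEL_INV)
                hit = !hit;                     /* inverted comparison */
            total += (unsigned long)hit;        /* count qualifying cycles */
        }
    }
    return total;
}
```

For the per-cycle occurrences 0, 1, 2, 3, 0, 2, this model yields 8
with cnt_mask set to 0, 3 with cnt_mask set to 2, and 3 with cnt_mask
set to 2 and PFMLIB_NHM_SEL_INV set.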


The library can be used to set up the PMC registers associated with
PEBS. In this case, the pfp_nhm_pebs field (of type pfmlib_nhm_pebs_t)
must be used and the pebs_used field must be set to 1.

To enable the PEBS load latency filtering capability, it is necessary
to program the MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD event into one
generic counter. The latency threshold must be passed to the library in
the ld_lat_thres field. It is expressed in core cycles and must be
greater than 3. Note that pebs_used must be set as well.
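
A minimal sketch of filling in the PEBS settings for load latency
filtering, following the two constraints stated above. The helper
nhm_pebs_set_ld_lat() is hypothetical and the typedef is a local mirror
of the structure shown earlier; programming the
MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD event itself is still done
through the generic event interface and is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the PEBS structure described above. */
typedef struct {
    unsigned int pebs_used;
    unsigned int ld_lat_thres;
} pfmlib_nhm_pebs_t;

/* Hypothetical helper: returns 0 on success, -1 if the threshold is not
 * greater than 3 core cycles. The structure is left untouched on error. */
static int nhm_pebs_set_ld_lat(pfmlib_nhm_pebs_t *pebs, unsigned int cycles)
{
    assert(pebs != NULL);
    if (cycles <= 3)
        return -1;
    pebs->pebs_used = 1;            /* PEBS must be marked as used */
    pebs->ld_lat_thres = cycles;    /* threshold in core cycles */
    return 0;
}
```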


The library can be used to set up LBR registers. On Intel Nehalem
processors, the LBR is 16 entries deep and it is possible to filter
branches based on privilege level or type. To configure the LBR, the
pfmlib_nhm_lbr_t structure must be used.

Like core PMU counters, LBR only distinguishes two privilege levels: 0
and the rest (1, 2, 3). When running Linux natively, the kernel is at
privilege level 0 and applications are at level 3. The privilege level
of LBR can be specified using the lbr_plm field. Any attempt to pass
PFM_PLM1 or PFM_PLM2 will be rejected. If lbr_plm is 0, then the global
value in the pfp_dfl_plm field of pfmlib_input_param_t is used.
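
The privilege level rule can be sketched as follows. This is an
illustration, not the library's implementation: nhm_lbr_resolve_plm()
is a hypothetical helper, the PFM_PLM bit values are assumptions to be
checked against perfmon/pfmlib.h, and the sketch does not validate the
default value it falls back to.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one bit per privilege level; check perfmon/pfmlib.h. */
#define PFM_PLM0 0x1
#define PFM_PLM1 0x2
#define PFM_PLM2 0x4
#define PFM_PLM3 0x8

/* Hypothetical helper: computes the privilege mask applied to the LBR.
 * lbr_plm is the LBR-specific mask, dfl_plm the global default mask.
 * Returns 0 and stores the mask in *out, or -1 if lbr_plm is rejected. */
static int nhm_lbr_resolve_plm(unsigned int lbr_plm, unsigned int dfl_plm,
                               unsigned int *out)
{
    assert(out != NULL);
    if (lbr_plm & (PFM_PLM1 | PFM_PLM2))
        return -1;                          /* levels 1 and 2 are not accepted */
    *out = lbr_plm ? lbr_plm : dfl_plm;     /* 0 means: use the global default */
    return 0;
}
```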

By default, LBR captures all branches. It is possible to filter out
branches by passing a set of flags in lbr_filter. The flags are as
follows:

PFM_NHM_LBR_JCC
    When set, LBR does not capture conditional branches. Default: off.

PFM_NHM_LBR_NEAR_REL_CALL
    When set, LBR does not capture near calls. Default: off.

PFM_NHM_LBR_NEAR_IND_CALL
    When set, LBR does not capture indirect calls. Default: off.

PFM_NHM_LBR_NEAR_RET
    When set, LBR does not capture return branches. Default: off.

PFM_NHM_LBR_NEAR_IND_JMP
    When set, LBR does not capture indirect branches. Default: off.

PFM_NHM_LBR_NEAR_REL_JMP
    When set, LBR does not capture relative branches. Default: off.

PFM_NHM_LBR_FAR_BRANCH
    When set, LBR does not capture far branches. Default: off.
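
Because each flag removes a class of branches rather than selecting
one, a filter of 0 captures everything. The sketch below illustrates
this inverse sense. The bit values are assumptions (the real ones come
from pfmlib_intel_nhm.h), the typedef is a local mirror of the
structure shown earlier, and both helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: illustrative bit values; use the definitions from pfmlib_intel_nhm.h. */
#define PFM_NHM_LBR_NEAR_REL_CALL 0x008
#define PFM_NHM_LBR_NEAR_IND_CALL 0x010
#define PFM_NHM_LBR_NEAR_RET      0x020
#define PFM_NHM_LBR_FAR_BRANCH    0x100

/* Local mirror of the LBR structure described above. */
typedef struct {
    unsigned int lbr_used;
    unsigned int lbr_plm;
    unsigned int lbr_filter;
} pfmlib_nhm_lbr_t;

/* Hypothetical helper: returns 1 if branches of the given kind are still captured. */
static int nhm_lbr_captures(const pfmlib_nhm_lbr_t *lbr, unsigned int kind)
{
    assert(lbr != NULL);
    return (lbr->lbr_filter & kind) == 0;
}

/* Hypothetical helper: stop capturing return branches and far branches,
 * leaving every other branch type captured. */
static void nhm_lbr_skip_returns_and_far(pfmlib_nhm_lbr_t *lbr)
{
    assert(lbr != NULL);
    lbr->lbr_filter |= PFM_NHM_LBR_NEAR_RET | PFM_NHM_LBR_FAR_BRANCH;
}
```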


By nature, the uncore PMU does not distinguish privilege levels;
therefore, it captures events at all privilege levels. To avoid any
misinterpretation, the library enforces that uncore events be measured
with both PFM_PLM0 and PFM_PLM3 set.
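
The requirement on uncore events amounts to a simple mask test, as in
the sketch below. The PFM_PLM bit values are assumptions to be checked
against perfmon/pfmlib.h, and both functions are hypothetical; the
library performs its own check.

```c
#include <assert.h>

/* Assumption: one bit per privilege level; check perfmon/pfmlib.h. */
#define PFM_PLM0 0x1
#define PFM_PLM3 0x8

/* Hypothetical helper: returns 1 if the mask has both PFM_PLM0 and PFM_PLM3 set. */
static int nhm_uncore_plm_ok(unsigned int plm)
{
    const unsigned int required = PFM_PLM0 | PFM_PLM3;

    return (plm & required) == required;
}

/* Usage example: only the combined kernel and user mask is accepted. */
static void nhm_uncore_plm_example(void)
{
    assert(nhm_uncore_plm_ok(PFM_PLM0 | PFM_PLM3));
    assert(!nhm_uncore_plm_ok(PFM_PLM3));
}
```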

Tools and operating system kernel interfaces may impose further
restrictions on how the uncore PMU can be accessed.


SEE ALSO
pfm_dispatch_events(3) and the set of examples shipped with the library

AUTHOR
Stephane Eranian <eranian@gmail.com>

                                January, 2009                        LIBPFM(3)