PAPI_profil(3)                       PAPI                       PAPI_profil(3)

NAME

       PAPI_profil - generate a histogram of hardware counter overflows vs. PC
       addresses

SYNOPSIS

       C Interface
       #include <papi.h>
       int PAPI_profil(void * buf, unsigned bufsiz, unsigned long offset,
                       unsigned scale, int EventSet, int EventCode, int threshold,
                       int flags);

       Fortran Interface
       The profiling routines have no Fortran interface.

DESCRIPTION

       PAPI_profil() provides hardware event statistics by profiling the
       occurrence of specified hardware counter events. It is designed to
       mimic the UNIX SVR4 profil call. The statistics are generated by
       creating a histogram of hardware counter event overflows vs. program
       counter addresses for the current process. The histogram is defined
       for a specific region of program code to be profiled, and the
       identified region is logically divided into a set of equal-size
       subdivisions, each of which corresponds to a count in the histogram.
       On each hardware event overflow, the subdivision containing the
       current program counter is identified and its histogram count is
       incremented. These counts establish a relative measure of how many
       hardware counter events are occurring in each code subdivision. The
       resulting histogram counts for a profiled region can be used to
       identify those program addresses that generate a disproportionately
       high percentage of the event of interest.

       Events to be profiled are specified with the EventSet and EventCode
       parameters. More than one event can be profiled simultaneously by
       calling PAPI_profil() several times with different EventCode values.
       Profiling can be turned off for a given event by calling PAPI_profil()
       with a threshold value of 0.

ARGUMENTS

       *buf -- pointer to a buffer of bufsiz bytes in which the histogram
       counts are stored in an array of unsigned short, unsigned int, or
       unsigned long long values, or 'buckets'. The size of the buckets is
       determined by values in the flags argument.

       bufsiz -- the size of the histogram buffer in bytes. It is computed
       from the length of the code region to be profiled, the size of the
       buckets, and the scale factor as discussed below.

       offset -- the start address of the region to be profiled.

       scale -- broadly and historically speaking, a contraction factor that
       indicates how much smaller the histogram buffer is than the region to
       be profiled. More precisely, scale is interpreted as an unsigned 16-bit
       fixed-point fraction with the binary point implied on the left. Its
       value is the reciprocal of the number of addresses in a subdivision,
       per counter (bucket) of the histogram buffer. Below is a table of
       representative values for scale:

       ┌────────────────────────────────────────────────────────────────────────────────────────────┐
       │Representative values for the scale variable                                                │
       ├────────┬─────────┬─────────────────────────────────────────────────────────────────────────┤
       │HEX     │ DECIMAL │ DEFINITION                                                              │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │0x20000 │ 131072  │ Maps precisely one instruction address to a unique bucket in buf.       │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │0x10000 │  65536  │ Maps precisely two instruction addresses to a unique bucket in buf.     │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │ 0xFFFF │  65535  │ Maps approximately two instruction addresses to a unique bucket in buf. │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │ 0x8000 │  32768  │ Maps every four instruction addresses to a bucket in buf.               │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │ 0x4000 │  16384  │ Maps every eight instruction addresses to a bucket in buf.              │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │ 0x0002 │      2  │ Maps all instruction addresses to the same bucket in buf.               │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │ 0x0001 │      1  │ Undefined.                                                              │
       ├────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
       │ 0x0000 │      0  │ Undefined.                                                              │
       └────────┴─────────┴─────────────────────────────────────────────────────────────────────────┘

       Historically, the scale factor was introduced to allow the allocation
       of buffers smaller than the code size to be profiled. Data and
       instruction sizes were assumed to be multiples of 16 bits. These
       assumptions are no longer necessarily true. PAPI_profil has preserved
       the traditional definition of scale where appropriate, but has
       deprecated the definitions for 0 and 1 (disable scaling) and extended
       the range of scale to include 65536 and 131072, which allow for
       exactly two addresses and exactly one address per profiling bucket,
       respectively.
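
       The scale table can be read as a mapping from a program counter value
       to a bucket index. The following C sketch is one arithmetic reading
       that is consistent with the table rows (0x20000 gives one address per
       bucket, 0x10000 gives two, and so on). It is an illustration only, not
       PAPI's internal code; the helper name bucket_index and its parameters
       are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of PAPI: map a program counter value to a
 * bucket index so that scale = 0x20000 yields one address per bucket and
 * scale = 0x10000 yields two, as in the scale table.
 * Returns -1 if pc falls outside the profiled region. */
static long bucket_index(uint64_t pc, uint64_t offset, uint64_t nbuckets,
                         unsigned scale)
{
    assert(scale >= 2);            /* 0 and 1 are listed as undefined */

    if (pc < offset)
        return -1;                 /* below the start of the region */

    /* (pc - offset) * scale / 0x20000, computed in 64-bit arithmetic. */
    uint64_t idx = ((pc - offset) * (uint64_t)scale) >> 17;

    return idx < nbuckets ? (long)idx : -1;
}
```

       Under this reading, an address 8 bytes past offset lands in bucket 8
       for scale 0x20000, bucket 4 for 0x10000, bucket 2 for 0x8000, and
       bucket 1 for 0x4000.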

       The value of bufsiz is computed as follows:

       bufsiz = (end - start)*(bucket_size/2)*(scale/65536)

       where

       bufsiz - the size of the buffer in bytes

       end, start - the ending and starting addresses of the profiled region

       bucket_size - the size of each bucket in bytes; 2, 4, or 8 as defined
       in flags

       scale - as defined above
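
       As a worked form of the bufsiz formula, the following C sketch
       evaluates it with integer arithmetic. The helper name profil_bufsiz
       is hypothetical and not part of PAPI; it multiplies before dividing
       so that the fraction scale/65536 is not truncated to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PAPI: evaluate
 *   bufsiz = (end - start) * (bucket_size / 2) * (scale / 65536)
 * bucket_size must be 2, 4, or 8 bytes. */
static size_t profil_bufsiz(uintptr_t start, uintptr_t end,
                            unsigned bucket_size, unsigned scale)
{
    assert(end >= start);
    assert(bucket_size == 2 || bucket_size == 4 || bucket_size == 8);

    uint64_t length = (uint64_t)(end - start);

    /* Multiply first, divide last: scale / 65536 alone would truncate
     * to 0 in integer arithmetic for any scale below 65536. */
    uint64_t bytes = length * (bucket_size / 2) * scale / 65536;

    return (size_t)bytes;
}
```

       For a 4096-byte region with 16-bit buckets, this gives 4096 bytes at
       scale 65536 and 8192 bytes at scale 131072.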

       EventSet -- the PAPI EventSet to profile. This EventSet is marked as
       profiling-ready, but profiling does not actually start until a
       PAPI_start() call is issued.

       EventCode -- code of the event in the EventSet to profile. This event
       must already be a member of the EventSet.

       threshold -- minimum number of events that must occur before the PC is
       sampled. If hardware overflow is supported for your substrate, an
       interrupt is triggered when this threshold is reached. Otherwise, the
       counters are sampled periodically and the PC is recorded for the first
       sample that exceeds the threshold. If the value of threshold is 0,
       profiling is disabled for this event.

       flags -- bit pattern to control profiling behavior. Defined values are
       shown in the table below:

       ┌───────────────────────────────────────────────────┐
       │Defined bits for the flags variable                │
       ├──────────────────────┬────────────────────────────┤
       │PAPI_PROFIL_POSIX     │ Default type of profiling, │
       │                      │ similar to profil(3).      │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_RANDOM    │ Drop a random 25% of the   │
       │                      │ samples.                   │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_WEIGHTED  │ Weight the samples by      │
       │                      │ their value.               │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_COMPRESS  │ Ignore samples as values   │
       │                      │ in the hash buckets get    │
       │                      │ big.                       │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_BUCKET_16 │ Use unsigned short (16     │
       │                      │ bit) buckets. This is the  │
       │                      │ default bucket size.       │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_BUCKET_32 │ Use unsigned int (32 bit)  │
       │                      │ buckets.                   │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_BUCKET_64 │ Use unsigned long long (64 │
       │                      │ bit) buckets.              │
       ├──────────────────────┼────────────────────────────┤
       │PAPI_PROFIL_FORCE_SW  │ Force software overflow in │
       │                      │ profiling.                 │
       └──────────────────────┴────────────────────────────┘

RETURN VALUES

       On success, this function returns PAPI_OK.
       On error, a non-zero error code is returned.

ERRORS

       PAPI_EINVAL
              One or more of the arguments is invalid.

       PAPI_ENOMEM
              Insufficient memory to complete the operation.

       PAPI_ENOEVST
              The EventSet specified does not exist.

       PAPI_EISRUN
              The EventSet is currently counting events.

       PAPI_ECNFLCT
              The underlying counter hardware cannot count this event and
              other events in the EventSet simultaneously.

       PAPI_ENOEVNT
              The PAPI preset is not available on the underlying hardware.

EXAMPLES

       int retval;
       unsigned long start, length;
       PAPI_exe_info_t *prginfo;
       unsigned short *profbuf;

       /* Look up the text segment of the running executable. */
       if ((prginfo = PAPI_get_executable_info()) == NULL)
         handle_error(1);

       /* In PAPI releases where PAPI_exe_info_t nests the address map,
          these fields are reached through prginfo->address_info instead. */
       start = (unsigned long)prginfo->text_start;
       length = (unsigned long)(prginfo->text_end - prginfo->text_start);

       /* With 16-bit buckets and scale 65536, bufsiz equals length. */
       profbuf = (unsigned short *)malloc(length);
       if (profbuf == NULL)
         handle_error(1);
       memset(profbuf, 0x00, length);
        .
        .
        .
       /* Sample the PC once per 1000000 PAPI_FP_INS events. */
       if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet,
                       PAPI_FP_INS, 1000000,
                       PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) != PAPI_OK)
         handle_error(retval);

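       After a profiled run, the buffer can be scanned to find the code
       subdivision with the most overflows. The sketch below assumes 16-bit
       buckets and the address-to-bucket mapping implied by the scale table
       (0x20000 / scale addresses per bucket). The helper names
       hottest_bucket and bucket_start are hypothetical and not part of PAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PAPI: return the index of the bucket
 * with the largest count in a 16-bit histogram. The first bucket wins
 * when several buckets share the maximum. */
static size_t hottest_bucket(const unsigned short *buf, size_t nbuckets)
{
    assert(buf != NULL && nbuckets > 0);

    size_t best = 0;
    for (size_t i = 1; i < nbuckets; i++)
        if (buf[i] > buf[best])
            best = i;
    return best;
}

/* Hypothetical helper: first address covered by bucket idx, assuming
 * each bucket spans 0x20000 / scale addresses starting at offset. */
static uint64_t bucket_start(uint64_t offset, size_t idx, unsigned scale)
{
    assert(scale >= 2);            /* 0 and 1 are listed as undefined */
    return offset + ((uint64_t)idx * 0x20000u) / scale;
}
```

       With the scale of 65536 used in the example above, bucket 2 would
       start 4 bytes past the offset passed to PAPI_profil().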

BUGS

       If you call PAPI_profil, PAPI allocates buffer space that will not be
       freed if you call PAPI_shutdown or PAPI_cleanup_eventset. To free all
       memory, you must call PAPI_profil on the Events with a 0 threshold.

SEE ALSO

       PAPI_sprofil(3), PAPI_overflow(3), PAPI_get_executable_info(3)

PAPI Programmer's Reference     September, 2004                 PAPI_profil(3)