gprof(1) User Commands gprof(1)

NAME
gprof - display call-graph profile data

SYNOPSIS
gprof [-abcCDlsz] [-e function-name] [-E function-name]
     [-f function-name] [-F function-name]
     [image-file [profile-file...]]
     [-n number of functions]

DESCRIPTION
The gprof utility produces an execution profile of a program. The
effect of called routines is incorporated in the profile of each
caller. The profile data is taken from the call graph profile file
that is created by programs compiled with the -xpg option of cc(1), or
by the -pg option with other compilers, or by setting the LD_PROFILE
environment variable for shared objects. See ld.so.1(1). These
compiler options also link in versions of the library routines which
are compiled for profiling. The symbol table in the executable image
file image-file (a.out by default) is read and correlated with the
call graph profile file profile-file (gmon.out by default).

First, execution times for each routine are propagated along the edges
of the call graph. Cycles are discovered, and calls into a cycle are
made to share the time of the cycle. The first listing shows the
functions sorted according to the time they represent, including the
time of their call graph descendants. Below each function entry is
shown its (direct) call-graph children and how their times are
propagated to this function. A similar display above the function
shows how this function's time and the time of its descendants are
propagated to its (direct) call-graph parents.

Cycles are also shown, with an entry for the cycle as a whole and a
listing of the members of the cycle and their contributions to the
time and call counts of the cycle.

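As an illustration of what "discovering cycles" means, the following
Python sketch groups mutually recursive functions using Tarjan's
strongly connected components algorithm. This is an assumption chosen
for illustration, not a description of how gprof itself is
implemented; the function find_cycles and the sample graph are
hypothetical.

```python
def find_cycles(call_graph):
    """Return groups of mutually recursive functions.

    call_graph maps a caller name to a list of callee names. Only
    groups with more than one member are reported as cycles.
    """
    index = {}          # discovery order of each function
    lowlink = {}        # lowest discovery index reachable from it
    stack = []          # functions on the current search path
    on_stack = set()
    cycles = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for callee in call_graph.get(node, ()):
            if callee not in index:
                visit(callee)
                lowlink[node] = min(lowlink[node], lowlink[callee])
            elif callee in on_stack:
                lowlink[node] = min(lowlink[node], index[callee])
        if lowlink[node] == index[node]:
            # node is the root of a strongly connected component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:
                cycles.append(sorted(component))

    for node in list(call_graph):
        if node not in index:
            visit(node)
    return cycles


# Hypothetical call graph: a, b and c call each other in a loop.
graph = {"main": ["a", "d"], "a": ["b"], "b": ["c"], "c": ["a"], "d": []}
cycles = find_cycles(graph)
print(cycles)
```

In this sketch the three functions a, b and c would be reported as one
cycle, which corresponds to the single "cycle as a whole" entry
described above.
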
Next, a flat profile is given, similar to that provided by prof(1).
This listing gives the total execution times and call counts for each
of the functions in the program, sorted by decreasing time. Finally,
an index is given, which shows the correspondence between function
names and call-graph profile index numbers.

A single function may be split into subfunctions for profiling by
means of the MARK macro. See prof(5).

Beware of quantization errors. The granularity of the sampling is
shown, but remains statistical at best. It is assumed that the time
for each execution of a function can be expressed by the total time
for the function divided by the number of times the function is
called. Thus the time propagated along the call-graph arcs to parents
of that function is directly proportional to the number of times that
arc is traversed.

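The proportionality assumption can be written out as a short Python
sketch. The function name propagate_to_parents and the sample figures
are hypothetical; the sketch shows only the arithmetic described
above, not gprof's actual code.

```python
def propagate_to_parents(total_time, arc_calls):
    """Split a function's time among its parents.

    total_time is the time attributed to the function (including its
    descendants). arc_calls maps each parent to the number of times
    that parent called the function. Each parent receives a share
    proportional to its call count.
    """
    total_calls = sum(arc_calls.values())
    if total_calls == 0:
        return {parent: 0.0 for parent in arc_calls}
    return {
        parent: total_time * calls / total_calls
        for parent, calls in arc_calls.items()
    }


# A function that accumulated 3.0 seconds, called 30 times from
# "parse" and 10 times from "emit" (made-up numbers).
shares = propagate_to_parents(3.0, {"parse": 30, "emit": 10})
print(shares)
```

Note the consequence: if the 10 calls from "emit" were in fact far
more expensive than the 30 calls from "parse", the profile would not
show it, because every call is assumed to cost the same.
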
The profiled program must call exit(2) or return normally for the
profiling information to be saved in the gmon.out file.

OPTIONS
The following options are supported:

-a                  Suppress printing statically declared functions. If
                    this option is given, all relevant information
                    about the static function (for instance, time
                    samples, calls to other functions, calls from other
                    functions) is attributed to the function loaded
                    just before the static function in the a.out file.

-b                  Brief. Suppress descriptions of each field in the
                    profile.

-c                  Discover the static call-graph of the program by a
                    heuristic which examines the text space of the
                    object file. Static-only parents or children are
                    indicated with call counts of 0. Note that for
                    dynamically linked executables, the linked shared
                    objects' text segments are not examined.

-C                  Demangle C++ symbol names before printing them out.

-D                  Produce a profile file gmon.sum that represents the
                    difference of the profile information in all
                    specified profile files. This summary profile file
                    may be given to subsequent executions of gprof
                    (also with -D) to summarize profile data across
                    several runs of an a.out file. See also the -s
                    option.

                    As an example, suppose function A calls function B
                    n times in profile file gmon.sum, and m times in
                    profile file gmon.out. With -D, a new gmon.sum file
                    will be created showing the number of calls from A
                    to B as n-m.

-e function-name    Suppress printing the graph profile entry for
                    routine function-name and all its descendants
                    (unless they have other ancestors that are not
                    suppressed). More than one -e option may be given.
                    Only one function-name may be given with each -e
                    option.

-E function-name    Suppress printing the graph profile entry for
                    routine function-name (and its descendants) as -e,
                    above, and also exclude the time spent in
                    function-name (and its descendants) from the total
                    and percentage time computations. More than one -E
                    option may be given. For example:

                    -E mcount -E mcleanup

                    is the default.

-f function-name    Print the graph profile entry only for routine
                    function-name and its descendants. More than one
                    -f option may be given. Only one function-name may
                    be given with each -f option.

-F function-name    Print the graph profile entry only for routine
                    function-name and its descendants (as -f, above)
                    and also use only the times of the printed routines
                    in total time and percentage computations. More
                    than one -F option may be given. Only one
                    function-name may be given with each -F option.
                    The -F option overrides the -E option.

-l                  Suppress the reporting of graph profile entries for
                    all local symbols. This option would be the
                    equivalent of placing all of the local symbols for
                    the specified executable image on the -E exclusion
                    list.

-n                  Limits the size of flat and graph profile listings
                    to the top n offending functions.

-s                  Produce a profile file gmon.sum which represents
                    the sum of the profile information in all of the
                    specified profile files. This summary profile file
                    may be given to subsequent executions of gprof
                    (also with -s) to accumulate profile data across
                    several runs of an a.out file. See also the -D
                    option.

-z                  Display routines which have zero usage (as
                    indicated by call counts and accumulated time).
                    This is useful in conjunction with the -c option
                    for discovering which routines were never called.
                    Note that this has restricted use for dynamically
                    linked executables, since shared object text space
                    will not be examined by the -c option.

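The effect of -s and -D on call counts can be sketched in Python.
Here each profile is modeled, as a simplifying assumption, as a
mapping from (caller, callee) arcs to call counts; real profile files
also carry time samples, which this sketch ignores. The function names
sum_arcs and diff_arcs are hypothetical.

```python
from collections import Counter


def sum_arcs(profiles):
    """Model of -s: add the call counts of every profile together."""
    total = Counter()
    for arcs in profiles:
        total.update(arcs)
    return dict(total)


def diff_arcs(summary, run):
    """Model of -D: subtract the counts of one run from a summary."""
    arcs = set(summary) | set(run)
    return {arc: summary.get(arc, 0) - run.get(arc, 0) for arc in arcs}


# A calls B n = 7 times in gmon.sum and m = 3 times in gmon.out.
gmon_sum = {("A", "B"): 7}
gmon_out = {("A", "B"): 3}

summed = sum_arcs([gmon_sum, gmon_out])
difference = diff_arcs(gmon_sum, gmon_out)
print(summed, difference)
```

With these made-up counts, the -D model yields n-m = 4 calls from A to
B, matching the example given under the -D option, while the -s model
yields n+m = 10.
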
ENVIRONMENT VARIABLES
PROFDIR    If this environment variable contains a value, place
           profiling output within that directory, in a file named
           pid.programname. pid is the process ID and programname is
           the name of the program being profiled, as determined by
           removing any path prefix from the argv[0] with which the
           program was called. If the variable contains a null value,
           no profiling output is produced. If the variable is not
           set, profiling output is placed in the file gmon.out.

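The three PROFDIR cases can be summarized as a small decision
function. This Python sketch is only a restatement of the rules above;
the function name profile_output_path is hypothetical and the real
logic lives in the profiling runtime, not in gprof.

```python
import posixpath


def profile_output_path(environ, argv0, pid):
    """Return where profiling output would go, or None if suppressed.

    environ is a mapping of environment variables, argv0 is the name
    the program was invoked with, and pid is its process ID.
    """
    if "PROFDIR" not in environ:
        # Variable not set: default file in the current directory.
        return "gmon.out"
    profdir = environ["PROFDIR"]
    if profdir == "":
        # Variable set to a null value: no profiling output.
        return None
    # Variable set to a directory: pid.programname inside it.
    programname = posixpath.basename(argv0)
    return posixpath.join(profdir, f"{pid}.{programname}")


print(profile_output_path({}, "./a.out", 42))
print(profile_output_path({"PROFDIR": ""}, "./a.out", 42))
print(profile_output_path({"PROFDIR": "/tmp/prof"}, "/usr/bin/myprog", 1234))
```

The directory /tmp/prof, the program name myprog, and the process IDs
are made-up values for illustration.
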
FILES
a.out       executable file containing namelist

gmon.out    dynamic call-graph and profile

gmon.sum    summarized dynamic call-graph and profile

$PROFDIR/pid.programname

ATTRIBUTES
See attributes(5) for descriptions of the following attributes:

┌─────────────────────────────┬─────────────────────────────┐
│ ATTRIBUTE TYPE              │ ATTRIBUTE VALUE             │
├─────────────────────────────┼─────────────────────────────┤
│Availability                 │SUNWbtool                    │
└─────────────────────────────┴─────────────────────────────┘

SEE ALSO
cc(1), ld.so.1(1), prof(1), exit(2), pcsample(2), profil(2),
malloc(3C), malloc(3MALLOC), monitor(3C), attributes(5), prof(5)

Graham, S.L., Kessler, P.B., McKusick, M.K., "gprof: A Call Graph
Execution Profiler," Proceedings of the SIGPLAN '82 Symposium on
Compiler Construction, SIGPLAN Notices, Vol. 17, No. 6, pp. 120-126,
June 1982.

Linker and Libraries Guide

NOTES
If the executable image has been stripped and does not have the
.symtab symbol table, gprof reads the global dynamic symbol tables
.dynsym and .SUNW_ldynsym, if present. The symbols in the dynamic
symbol tables are a subset of the symbols that are found in .symtab.
The .dynsym symbol table contains the global symbols used by the
runtime linker. .SUNW_ldynsym augments the information in .dynsym with
local function symbols. In the case where .dynsym is found and
.SUNW_ldynsym is not, only the information for the global symbols is
available. Without local symbols, the behavior is as described for the
-a option.

LD_LIBRARY_PATH must not contain /usr/lib as a component when
compiling a program for profiling. If LD_LIBRARY_PATH contains
/usr/lib, the program will not be linked correctly with the profiling
versions of the system libraries in /usr/lib/libp.

The times reported in successive identical runs may show variances
because of varying cache-hit ratios that result from sharing the cache
with other processes. Even if a program seems to be the only one using
the machine, hidden background or asynchronous processes may blur the
data. In rare cases, the clock ticks initiating recording of the
program counter may beat with loops in a program, grossly distorting
measurements. Call counts are always recorded precisely, however.

Only programs that call exit or return from main are guaranteed to
produce a profile file, unless a final call to monitor is explicitly
coded.

Functions such as mcount(), _mcount(), moncontrol(), _moncontrol(),
monitor(), and _monitor() may appear in the gprof report. These
functions are part of the profiling implementation and thus account
for some amount of the runtime overhead. Since these functions are not
present in an unprofiled application, time accumulated and call counts
for these functions may be ignored when evaluating the performance of
an application.

64-bit profiling
64-bit profiling may be used freely with dynamically linked
executables, and profiling information is collected for the shared
objects if the objects are compiled for profiling. Care must be
applied to interpret the profile output, since it is possible for
symbols from different shared objects to have the same name. If name
duplication occurs in the profile output, the module id prefix before
the symbol name in the symbol index listing can be used to identify
the appropriate module for the symbol.

When using the -s or -D option to sum multiple profile files, care
must be taken not to mix 32-bit profile files with 64-bit profile
files.

32-bit profiling
32-bit profiling may be used with dynamically linked executables, but
care must be applied. In 32-bit profiling, shared objects cannot be
profiled with gprof. Thus, when a profiled, dynamically linked program
is executed, only the main portion of the image is sampled. This means
that all time spent outside of the main object, that is, time spent in
a shared object, will not be included in the profile summary; the
total time reported for the program may be less than the total time
used by the program.

Because the time spent in a shared object cannot be accounted for, the
use of shared objects should be minimized whenever a program is
profiled with gprof. If desired, the program should be linked to the
profiled version of a library (or to the standard archive version if
no profiling version is available), instead of the shared object to
get profile information on the functions of a library. Versions of
profiled libraries may be supplied with the system in the
/usr/lib/libp directory. Refer to compiler driver documentation on
profiling.

Consider an extreme case. A profiled program dynamically linked with
the shared C library spends 100 units of time in some libc routine,
say, malloc(). Suppose malloc() is called only from routine B and B
consumes only 1 unit of time. Suppose further that routine A consumes
10 units of time, more than any other routine in the main (profiled)
portion of the image. In this case, gprof will conclude that most of
the time is being spent in A and almost no time is being spent in B.
From this it will be almost impossible to tell that the greatest
improvement can be made by looking at routine B and not routine A. The
value of the profiler in this case is severely degraded; the solution
is to use archives as much as possible for profiling.

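The arithmetic of this extreme case can be worked through in a few
lines of Python. The sketch assumes, for illustration only, that A and
B are the only routines in the main image; the variable names are
hypothetical.

```python
# Time units taken from the example above. Treating A and B as the
# only routines in the profiled image is an assumption.
self_time = {"A": 10, "B": 1, "malloc": 100}
in_shared_object = {"malloc"}   # not sampled in 32-bit profiling
callers = {"malloc": "B"}       # malloc() is called only from B

# What gprof can see: samples from the main image only.
sampled = {f: t for f, t in self_time.items() if f not in in_shared_object}
sampled_total = sum(sampled.values())
apparent_hotspot = max(sampled, key=sampled.get)

# What actually happened: B's inclusive time includes malloc().
true_inclusive_b = self_time["B"] + sum(
    t for f, t in self_time.items() if callers.get(f) == "B"
)
true_total = sum(self_time.values())

print(f"reported total: {sampled_total} of {true_total} units")
print(f"apparent hotspot: {apparent_hotspot}")
print(f"true inclusive time of B: {true_inclusive_b}")
```

Under these assumptions the profile accounts for only 11 of the 111
units actually consumed and points at A, even though B and its callee
are responsible for 101 units.
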
BUGS
Parents which are not themselves profiled will have the time of their
profiled children propagated to them, but they will appear to be
spontaneously invoked in the call-graph listing, and will not have
their time propagated further. Similarly, signal catchers, even though
profiled, will appear to be spontaneous (although for more obscure
reasons). Any profiled children of signal catchers should have their
times propagated properly, unless the signal catcher was invoked
during the execution of the profiling routine, in which case all is
lost.

SunOS 5.11 8 Feb 2007 gprof(1)