1fprof(3)                   Erlang Module Definition                   fprof(3)
2
3
4

NAME

6       fprof  -  A Time Profiling Tool using trace to file for minimal runtime
7       performance impact.
8

DESCRIPTION

10       This module is used to profile a program to find out how the  execution
11       time  is  used.  Trace  to file is used to minimize runtime performance
12       impact.
13
14       The fprof module uses tracing to collect profiling data, hence there is
15       no  need  for special compilation of any module to be profiled. When it
16       starts tracing, fprof will erase all previous tracing in the  node  and
17       set the necessary trace flags on the profiling target processes as well
18       as local call trace on all functions in all loaded modules and all mod‐
19       ules  to  be loaded. fprof erases all tracing in the node when it stops
20       tracing.
21
22       fprof presents both own time i.e how much time a function has used  for
23       its own execution, and accumulated time i.e including called functions.
24       All presented times are collected using trace timestamps.  fprof  tries
25       to  collect  cpu  time  timestamps, if the host machine OS supports it.
26       Therefore the times may be wallclock times and OS scheduling will  ran‐
27       domly strike all called functions in a presumably fair way.
28
29       If,  however, the profiling time is short, and the host machine OS does
30       not support high resolution cpu time measurements, some few OS schedul‐
31       ings  may  show  up  as ridiculously long execution times for functions
32       doing practically nothing. An example of a function more or  less  just
33       composing a tuple in about 100 times the normal execution time has been
34       seen, and when the tracing was repeated, the execution time became nor‐
35       mal.
36
37       Profiling is essentially done in 3 steps:
38
39         1:
40           Tracing; to file, as mentioned in the previous paragraph. The trace
41           contains entries for function calls, returns to  function,  process
42           scheduling,  other process related (spawn, etc) events, and garbage
43           collection. All trace entries are timestamped.
44
45         2:
46           Profiling; the trace file is read, the execution call stack is sim‐
47           ulated,  and raw profile data is calculated from the simulated call
48           stack and the trace timestamps. The profile data is stored  in  the
49           fprof  server  state. During this step the trace data may be dumped
50           in text format to file or console.
51
52         3:
53           Analysing; the raw profile data is sorted, filtered and  dumped  in
54           text  format either to file or console. The text format intended to
55           be both readable for a human reader, as well as parsable  with  the
56           standard erlang parsing tools.
57
58       Since  fprof uses trace to file, the runtime performance degradation is
59       minimized, but still far from negligible, especially for programs  that
60       use  the  filesystem  heavily  by themselves. Where you place the trace
61       file is also important, e.g on Solaris /tmp is usually  a  good  choice
62       since  it  is  essentially  a RAM disk, while any NFS (network) mounted
63       disk is a bad idea.
64
65       fprof can also skip the file step and trace to a  tracer  process  that
66       does the profiling in runtime.
67

EXPORTS

69       start() -> {ok, Pid} | {error, {already_started, Pid}}
70
71              Types:
72
73                 Pid = pid()
74
75              Starts the fprof server.
76
77              Note  that  it seldom needs to be started explicitly since it is
78              automatically started by  the  functions  that  need  a  running
79              server.
80
81       stop() -> ok
82
83              Same as stop(normal).
84
85       stop(Reason) -> ok
86
87              Types:
88
89                 Reason = term()
90
91              Stops the fprof server.
92
93              The  supplied  Reason  becomes  the  exit  reason for the server
94              process. Default Any Reason other than kill sends a  request  to
95              the server and waits for it to clean up, reply and exit. If Rea‐
96              son is kill, the server is bluntly killed.
97
98              If the fprof server is not running, this function returns  imme‐
99              diately with the same return value.
100
101          Note:
102              When  the fprof server is stopped the collected raw profile data
103              is lost.
104
105
106       apply(Func, Args) -> term()
107
108              Types:
109
110                 Func = function() | {Module, Function}
111                 Args = [term()]
112                 Module = atom()
113                 Function = atom()
114
115              Same as apply(Func, Args, []).
116
117       apply(Module, Function, Args) -> term()
118
119              Types:
120
121                 Args = [term()]
122                 Module = atom()
123                 Function = atom()
124
125              Same as apply({Module, Function}, Args, []).
126
127       apply(Func, Args, OptionList) -> term()
128
129              Types:
130
131                 Func = function() | {Module, Function}
132                 Args = [term()]
133                 OptionList = [Option]
134                 Module = atom()
135                 Function = atom()
136                 Option = continue | start | {procs, PidList} |  TraceStartOp‐
137                 tion
138
139              Calls erlang:apply(Func, Args) surrounded by trace([start, ...])
140              and trace(stop).
141
142              Some effort is made to keep the  trace  clean  from  unnecessary
143              trace  messages;  tracing  is started and stopped from a spawned
144              process while the erlang:apply/2 call is  made  in  the  current
145              process,  only surrounded by receive and send statements towards
146              the trace starting process. The  trace  starting  process  exits
147              when not needed any more.
148
149              The  TraceStartOption  is  any  option  allowed for trace/1. The
150              options [start, {procs, [self() | PidList]} | OptList] are given
151              to trace/1, where OptList is OptionList with continue, start and
152              {procs, _} options removed.
153
154              The continue option inhibits the call to trace(stop) and  leaves
155              it up to the caller to stop tracing at a suitable time.
156
157       apply(Module, Function, Args, OptionList) -> term()
158
159              Types:
160
161                 Module = atom()
162                 Function = atom()
163                 Args = [term()]
164
165              Same as apply({Module, Function}, Args, OptionList).
166
167              OptionList is an option list allowed for apply/3.
168
169       trace(start,  Filename)  ->  ok | {error, Reason} | {'EXIT', ServerPid,
170       Reason}
171
172              Types:
173
174                 Reason = term()
175
176              Same as trace([start, {file, Filename}]).
177
178       trace(verbose, Filename) -> ok | {error, Reason} | {'EXIT',  ServerPid,
179       Reason}
180
181              Types:
182
183                 Reason = term()
184
185              Same as trace([start, verbose, {file, Filename}]).
186
187       trace(OptionName,  OptionValue)  ->  ok  |  {error,  Reason} | {'EXIT',
188       ServerPid, Reason}
189
190              Types:
191
192                 OptionName = atom()
193                 OptionValue = term()
194                 Reason = term()
195
196              Same as trace([{OptionName, OptionValue}]).
197
198       trace(verbose) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
199
200              Types:
201
202                 Reason = term()
203
204              Same as trace([start, verbose]).
205
206       trace(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
207
208              Types:
209
210                 OptionName = atom()
211                 Reason = term()
212
213              Same as trace([OptionName]).
214
215       trace({OptionName, OptionValue}) -> ok |  {error,  Reason}  |  {'EXIT',
216       ServerPid, Reason}
217
218              Types:
219
220                 OptionName = atom()
221                 OptionValue = term()
222                 Reason = term()
223
224              Same as trace([{OptionName, OptionValue}]).
225
226       trace([Option]) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
227
228              Types:
229
230                 Option = start | stop | {procs, PidSpec} | {procs, [PidSpec]}
231                 | verbose | {verbose, bool()} | file  |  {file,  Filename}  |
232                 {tracer, Tracer}
233                 PidSpec = pid() | atom()
234                 Tracer = pid() | port()
235                 Reason = term()
236
237              Starts or stops tracing.
238
239              PidSpec  and  Tracer  are used in calls to erlang:trace(PidSpec,
240              true, [{tracer, Tracer} | Flags]), and Filename is used to  call
241              dbg:trace_port(file, Filename). Please see the appropriate docu‐
242              mentation.
243
244              Option description:
245
246                stop:
247                  Stops a running fprof trace and clears all tracing from  the
248                  node. Either option stop or start must be specified, but not
249                  both.
250
251                start:
252                  Clears all tracing from the node  and  starts  a  new  fprof
253                  trace.  Either  option  start or stop must be specified, but
254                  not both.
255
256                verbose| {verbose, bool()}:
257                  The options verbose or {verbose, true} adds some trace flags
258                  that  fprof  does  not need, but that may be interesting for
259                  general debugging purposes. This option is only allowed with
260                  the start option.
261
262                cpu_time| {cpu_time, bool()}:
263                  The  options  cpu_time  or  {cpu_time, true> makes the time‐
264                  stamps in the trace be in CPU time instead of wallclock time
265                  which  is  the default. This option is only allowed with the
266                  start option.
267
268                {procs, PidSpec}| {procs, [PidSpec]}:
269                  Specifies which processes that  shall  be  traced.  If  this
270                  option is not given, the calling process is traced. All pro‐
271                  cesses spawned by the traced processes are also traced. This
272                  option is only allowed with the start option.
273
274                file| {file, Filename}:
275                  Specifies  the  filename of the trace. If the option file is
276                  given,  or  none  of  these  options  are  given,  the  file
277                  "fprof.trace"  is used. This option is only allowed with the
278                  start option, but not with the {tracer, Tracer} option.
279
280                {tracer, Tracer}:
281                  Specifies that trace  to  process  or  port  shall  be  done
282                  instead  of  trace to file. This option is only allowed with
283                  the start option, but not with the {file, Filename} option.
284
285       profile() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
286
287              Types:
288
289                 Reason = term()
290
291              Same as profile([]).
292
293       profile(OptionName, OptionValue) -> ok |  {error,  Reason}  |  {'EXIT',
294       ServerPid, Reason}
295
296              Types:
297
298                 OptionName = atom()
299                 OptionValue = term()
300                 Reason = term()
301
302              Same as profile([{OptionName, OptionValue}]).
303
304       profile(OptionName)  -> ok | {error, Reason} | {'EXIT', ServerPid, Rea‐
305       son}
306
307              Types:
308
309                 OptionName = atom()
310                 Reason = term()
311
312              Same as profile([OptionName]).
313
314       profile({OptionName, OptionValue}) -> ok | {error, Reason}  |  {'EXIT',
315       ServerPid, Reason}
316
317              Types:
318
319                 OptionName = atom()
320                 OptionValue = term()
321                 Reason = term()
322
323              Same as profile([{OptionName, OptionValue}]).
324
325       profile([Option])  ->  ok  |  {ok, Tracer} | {error, Reason} | {'EXIT',
326       ServerPid, Reason}
327
328              Types:
329
330                 Option = file | {file, Filename} |  dump  |  {dump,  Dump}  |
331                 append | start | stop
332                 Dump = pid() | Dumpfile | []
333                 Tracer = pid()
334                 Reason = term()
335
336              Compiles a trace into raw profile data held by the fprof server.
337
338              Dumpfile  is  used  to call file:open/2, and Filename is used to
339              call dbg:trace_port(file, Filename). Please see the  appropriate
340              documentation.
341
342              Option description:
343
344                file| {file, Filename}:
345                  Reads the file Filename and creates raw profile data that is
346                  stored in RAM by the fprof server. If  the  option  file  is
347                  given,  or  none  of  these  options  are  given,  the  file
348                  "fprof.trace" is read. The call will return when  the  whole
349                  trace  has been read with the return value ok if successful.
350                  This option is not allowed with the start or stop options.
351
352                dump| {dump, Dump}:
353                  Specifies the destination for the trace text dump.  If  this
354                  option is not given, no dump is generated, if it is dump the
355                  destination will be the caller's group leader, otherwise the
356                  destination  Dump  is  either  the pid of an I/O device or a
357                  filename. And, finally, if the filename is [] - "fprof.dump"
358                  is  used  instead.  This option is not allowed with the stop
359                  option.
360
361                append:
362                  Causes the trace text dump to be appended to the destination
363                  file.  This option is only allowed with the {dump, Dumpfile}
364                  option.
365
366                start:
367                  Starts a tracer process that profiles trace data in runtime.
368                  The  call will return immediately with the return value {ok,
369                  Tracer} if successful. This option is not allowed  with  the
370                  stop, file or {file, Filename} options.
371
372                stop:
373                  Stops  the  tracer  process that profiles trace data in run‐
374                  time. The return value will be value ok if successful.  This
375                  option  is  not allowed with the start, file or {file, File‐
376                  name} options.
377
378       analyse() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
379
380              Types:
381
382                 Reason = term()
383
384              Same as analyse([]).
385
386       analyse(OptionName, OptionValue) -> ok |  {error,  Reason}  |  {'EXIT',
387       ServerPid, Reason}
388
389              Types:
390
391                 OptionName = atom()
392                 OptionValue = term()
393                 Reason = term()
394
395              Same as analyse([{OptionName, OptionValue}]).
396
397       analyse(OptionName)  -> ok | {error, Reason} | {'EXIT', ServerPid, Rea‐
398       son}
399
400              Types:
401
402                 OptionName = atom()
403                 Reason = term()
404
405              Same as analyse([OptionName]).
406
407       analyse({OptionName, OptionValue}) -> ok | {error, Reason}  |  {'EXIT',
408       ServerPid, Reason}
409
410              Types:
411
412                 OptionName = atom()
413                 OptionValue = term()
414                 Reason = term()
415
416              Same as analyse([{OptionName, OptionValue}]).
417
418       analyse([Option]) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
419
420              Types:
421
422                 Option  = dest | {dest, Dest} | append | {cols, Cols} | call‐
423                 ers | {callers, bool()} | no_callers  |  {sort,  SortSpec}  |
424                 totals  |  {totals,  bool()}  | details | {details, bool()} |
425                 no_details
426                 Dest = pid() | Destfile
427                 Cols = integer() >= 80
428                 SortSpec = acc | own
429                 Reason = term()
430
431              Analyses raw profile data in the fprof server. If  called  while
432              there  is  no raw profile data available, {error, no_profile} is
433              returned.
434
435              Destfile is used to call file:open/2. Please see the appropriate
436              documentation.
437
438              Option description:
439
440                dest| {dest, Dest}:
441                  Specifies  the  destination for the analysis. If this option
442                  is not given or it is dest,  the  destination  will  be  the
443                  caller's  group  leader,  otherwise  the destination Dest is
444                  either the pid() of  an  I/O  device  or  a  filename.  And,
445                  finally,  if  the  filename is [] - "fprof.analysis" is used
446                  instead.
447
448                append:
449                  Causes the analysis to be appended to the destination  file.
450                  This  option  is  only  allowed  with  the  {dest, Destfile}
451                  option.
452
453                {cols, Cols}:
454                  Specifies the number of columns in  the  analysis  text.  If
455                  this option is not given the number of columns is set to 80.
456
457                callers| {callers, true}:
458                  Prints  callers and called information in the analysis. This
459                  is the default.
460
461                {callers, false}| no_callers:
462                  Suppresses the printing of callers and called information in
463                  the analysis.
464
465                {sort, SortSpec}:
466                  Specifies  if the analysis should be sorted according to the
467                  ACC column, which is the default, or  the  OWN  column.  See
468                  Analysis Format below.
469
470                totals| {totals, true}:
471                  Includes  a section containing call statistics for all calls
472                  regardless of process, in the analysis.
473
474                {totals, false}:
475                  Supresses the totals section in the analysis, which  is  the
476                  default.
477
478                details| {details, true}:
479                  Prints  call  statistics  for  each process in the analysis.
480                  This is the default.
481
482                {details, false}| no_details:
483                  Suppresses the call statistics for  each  process  from  the
484                  analysis.
485

ANALYSIS FORMAT

487       This  section  describes  the output format of the analyse command. See
488       analyse/0.
489
490       The format is parsable with the standard Erlang parsing tools  erl_scan
491       and  erl_parse,  file:consult/1  or  io:read/2. The parse format is not
492       explained here - it should be easy for the interested to  try  it  out.
493       Note that some flags to analyse/1 will affect the format.
494
495       The following example was run on OTP/R8 on Solaris 8, all OTP internals
496       in this example are very version dependent.
497
498       As an example, we will use the following function, that you may  recog‐
499       nise  as  a  slightly  modified  benchmark  function  from  the manpage
500       file(3):
501
502       -module(foo).
503       -export([create_file_slow/2]).
504
505       create_file_slow(Name, N) when integer(N), N >= 0 ->
506           {ok, FD} =
507               file:open(Name, [raw, write, delayed_write, binary]),
508           if N > 256 ->
509                   ok = file:write(FD,
510                                   lists:map(fun (X) -> <<X:32/unsigned>> end,
511                                   lists:seq(0, 255))),
512                   ok = create_file_slow(FD, 256, N);
513              true ->
514                   ok = create_file_slow(FD, 0, N)
515           end,
516           ok = file:close(FD).
517
518       create_file_slow(FD, M, M) ->
519           ok;
520       create_file_slow(FD, M, N) ->
521           ok = file:write(FD, <<M:32/unsigned>>),
522           create_file_slow(FD, M+1, N).
523
524       Let us have a look at the printout after running:
525
526       1> fprof:apply(foo, create_file_slow, [junk, 1024]).
527       2> fprof:profile().
528       3> fprof:analyse().
529
530       The printout starts with:
531
532       %% Analysis results:
533       {  analysis_options,
534        [{callers, true},
535         {sort, acc},
536         {totals, false},
537         {details, true}]}.
538
539       %                                       CNT       ACC       OWN
540       [{ totals,                             9627, 1691.119, 1659.074}].  %%%
541
542       The CNT column shows the total number of function calls that was  found
543       in  the  trace.  In  the ACC column is the total time of the trace from
544       first timestamp to last. And in the OWN column is the sum of the execu‐
545       tion  time  in functions found in the trace, not including called func‐
546       tions. In this case it is very close to the ACC time since the emulator
547       had practically nothing else to do than to execute our test program.
548
549       All time values in the printout are in milliseconds.
550
551       The printout continues:
552
553       %                                       CNT       ACC       OWN
554       [{ "<0.28.0>",                         9627,undefined, 1659.074}].   %%
555
556       This  is the printout header of one process. The printout contains only
557       this one process since we did fprof:apply/3 which traces only the  cur‐
558       rent  process.  Therefore the CNT and OWN columns perfectly matches the
559       totals above. The ACC column is undefined since summing the  ACC  times
560       of  all  calls  in the process makes no sense - you would get something
561       like the ACC value from totals above multiplied by the average depth of
562       the call stack, or something.
563
564       All  paragraphs  up  to  the next process header only concerns function
565       calls within this process.
566
567       Now we come to something more interesting:
568
569       {[{undefined,                             0, 1691.076,    0.030}],
570        { {fprof,apply_start_stop,4},            0, 1691.076,    0.030},     %
571        [{{foo,create_file_slow,2},              1, 1691.046,    0.103},
572         {suspend,                               1,    0.000,    0.000}]}.
573
574       {[{{fprof,apply_start_stop,4},            1, 1691.046,    0.103}],
575        { {foo,create_file_slow,2},              1, 1691.046,    0.103},     %
576        [{{file,close,1},                        1, 1398.873,    0.019},
577         {{foo,create_file_slow,3},              1,  249.678,    0.029},
578         {{file,open,2},                         1,   20.778,    0.055},
579         {{lists,map,2},                         1,   16.590,    0.043},
580         {{lists,seq,2},                         1,    4.708,    0.017},
581         {{file,write,2},                        1,    0.316,    0.021}]}.
582
583       The printout consists of one paragraph per called function.  The  func‐
584       tion  marked  with  '%'  is  the  one the paragraph concerns - foo:cre‐
585       ate_file_slow/2. Above the marked function are the calling functions  -
586       those  that  has  called  the marked, and below are those called by the
587       marked function.
588
589       The paragraphs are per default sorted in decreasing order  of  the  ACC
590       column for the marked function. The calling list and called list within
591       one paragraph are also per default sorted in decreasing order of  their
592       ACC column.
593
594       The  columns  are:  CNT  -  the  number  of times the function has been
595       called, ACC - the time spent in the  function  including  called  func‐
596       tions,  and  OWN  - the time spent in the function not including called
597       functions.
598
599       The rows for the calling functions contain statistics  for  the  marked
600       function  with  the  constraint that only the occasions when a call was
601       made from the row's function to the marked function are accounted for.
602
603       The row for the marked function simply contains the sum of all  calling
604       rows.
605
606       The  rows  for  the  called functions contains statistics for the row's
607       function with the constraint that only the occasions when  a  call  was
608       made from the marked to the row's function are accounted for.
609
610       So,  we  see  that foo:create_file_slow/2 used very little time for its
611       own execution. It spent most of its time in file:close/1. The  function
612       foo:create_file_slow/3 that writes 3/4 of the file contents is the sec‐
613       ond biggest time thief.
614
615       We also see that the call to file:write/2 that writes 1/4 of  the  file
616       contents  takes very little time in itself. What takes time is to build
617       the data (lists:seq/2 and lists:map/2).
618
619       The function 'undefined' that has called fprof:apply_start_stop/4 is an
620       unknown  function  because  that call was not recorded in the trace. It
621       was    only    recorded    that    the    execution    returned    from
622       fprof:apply_start_stop/4  to  some  other  function  above  in the call
623       stack, or that the process exited from there.
624
625       Let us continue down the printout to find:
626
627       {[{{foo,create_file_slow,2},              1,  249.678,    0.029},
628         {{foo,create_file_slow,3},            768,    0.000,   23.294}],
629        { {foo,create_file_slow,3},            769,  249.678,   23.323},     %
630        [{{file,write,2},                      768,  220.314,   14.539},
631         {suspend,                              57,    6.041,    0.000},
632         {{foo,create_file_slow,3},            768,    0.000,   23.294}]}.
633
634       If you compare with the code you will  see  there  also  that  foo:cre‐
635       ate_file_slow/3 was called only from foo:create_file_slow/2 and itself,
636       and called only file:write/2, note the number of calls to file:write/2.
637       But  here  we see that suspend was called a few times. This is a pseudo
638       function that indicates that the process was suspended while  executing
639       in   foo:create_file_slow/3,   and   since   there  is  no  receive  or
640       erlang:yield/0 in the code, it must be Erlang  scheduling  suspensions,
641       or  the  trace file driver compensating for large file write operations
642       (these are regarded as a schedule out followed by a schedule in to  the
643       same process).
644
645       Let us find the suspend entry:
646
647       {[{{file,write,2},                       53,    6.281,    0.000},
648         {{foo,create_file_slow,3},             57,    6.041,    0.000},
649         {{prim_file,drv_command,4},            50,    4.582,    0.000},
650         {{prim_file,drv_get_response,1},       34,    2.986,    0.000},
651         {{lists,map,2},                        10,    2.104,    0.000},
652         {{prim_file,write,2},                  17,    1.852,    0.000},
653         {{erlang,port_command,2},              15,    1.713,    0.000},
654         {{prim_file,drv_command,2},            22,    1.482,    0.000},
655         {{prim_file,translate_response,2},     11,    1.441,    0.000},
656         {{prim_file,'-drv_command/2-fun-0-',1},  15,    1.340,    0.000},
657         {{lists,seq,4},                         3,    0.880,    0.000},
658         {{foo,'-create_file_slow/2-fun-0-',1},   5,    0.523,    0.000},
659         {{erlang,bump_reductions,1},            4,    0.503,    0.000},
660         {{prim_file,open_int_setopts,3},        1,    0.165,    0.000},
661         {{prim_file,i32,4},                     1,    0.109,    0.000},
662         {{fprof,apply_start_stop,4},            1,    0.000,    0.000}],
663        { suspend,                             299,   32.002,    0.000},     %
664        [ ]}.
665
666       We find no particulary long suspend times, so no function seems to have
667       waited in a receive statement. Actually,  prim_file:drv_command/4  con‐
668       tains  a  receive statement, but in this test program, the message lies
669       in the process receive buffer when the receive statement is entered. We
670       also see that the total suspend time for the test run is small.
671
672       The  suspend  pseudo  function  has got an OWN time of zero. This is to
673       prevent the process total OWN time from including time  in  suspension.
674       Whether suspend time is really ACC or OWN time is more of a philosophi‐
675       cal question.
676
677       Now we look at another interesting pseudo function, garbage_collect:
678
679       {[{{prim_file,drv_command,4},            25,    0.873,    0.873},
680         {{prim_file,write,2},                  16,    0.692,    0.692},
681         {{lists,map,2},                         2,    0.195,    0.195}],
682        { garbage_collect,                      43,    1.760,    1.760},     %
683        [ ]}.
684
685       Here we see that no function distinguishes itself  considerably,  which
686       is very normal.
687
688       The  garbage_collect  pseudo  function  has not got an OWN time of zero
689       like suspend, instead it is equal to the ACC time.
690
691       Garbage collect often occurs while a process is  suspended,  but  fprof
692       hides  this  fact  by  pretending that the suspended function was first
693       unsuspended and then garbage collected. Otherwise  the  printout  would
694       show  garbage_collect  being called from suspend but not which function
695       that might have caused the garbage collection.
696
697       Let us now get back to the test code:
698
699       {[{{foo,create_file_slow,3},            768,  220.314,   14.539},
700         {{foo,create_file_slow,2},              1,    0.316,    0.021}],
701        { {file,write,2},                      769,  220.630,   14.560},     %
702        [{{prim_file,write,2},                 769,  199.789,   22.573},
703         {suspend,                              53,    6.281,    0.000}]}.
704
705       Not unexpectedly, we see that file:write/2  was  called  from  foo:cre‐
706       ate_file_slow/3 and foo:create_file_slow/2. The number of calls in each
707       case as well as the used time  are  also  just  confirms  the  previous
708       results.
709
710       We  see  that  file:write/2  only  calls  prim_file:write/2, but let us
711       refrain from digging into the internals of the kernel application.
712
713       But, if we nevertheless do dig down we find the call to the  linked  in
714       driver that does the file operations towards the host operating system:
715
716       {[{{prim_file,drv_command,4},           772, 1458.356, 1456.643}],
717        { {erlang,port_command,2},             772, 1458.356, 1456.643},     %
718        [{suspend,                              15,    1.713,    0.000}]}.
719
720       This  is  86  %  of  the total run time, and as we saw before it is the
721       close operation the absolutely biggest contributor. We find a  compari‐
722       son ratio a little bit up in the call stack:
723
724       {[{{prim_file,close,1},                   1, 1398.748,    0.024},
725         {{prim_file,write,2},                 769,  174.672,   12.810},
726         {{prim_file,open_int,4},                1,   19.755,    0.017},
727         {{prim_file,open_int_setopts,3},        1,    0.147,    0.016}],
728        { {prim_file,drv_command,2},           772, 1593.322,   12.867},     %
729        [{{prim_file,drv_command,4},           772, 1578.973,   27.265},
730         {suspend,                              22,    1.482,    0.000}]}.
731
732       The time for file operations in the linked in driver distributes itself
733       as 1 % for open, 11 % for write and 87 % for close. All data is  proba‐
734       bly buffered in the operating system until the close.
735
736       The   unsleeping   reader   may   notice   that   the   ACC  times  for
737       prim_file:drv_command/2  and  prim_file:drv_command/4  is   not   equal
738       between  the  paragraphs  above, even though it is easy to believe that
739       prim_file:drv_command/2 is just a passthrough function.
740
741       The missing time can be found in the paragraph  for  prim_file:drv_com‐
742       mand/4  where  it  is  evident that not only prim_file:drv_command/2 is
743       called but also a fun:
744
745       {[{{prim_file,drv_command,2},           772, 1578.973,   27.265}],
746        { {prim_file,drv_command,4},           772, 1578.973,   27.265},     %
747        [{{erlang,port_command,2},             772, 1458.356, 1456.643},
748         {{prim_file,'-drv_command/2-fun-0-',1}, 772,   87.897,   12.736},
749         {suspend,                              50,    4.582,    0.000},
750         {garbage_collect,                      25,    0.873,    0.873}]}.
751
752       And  some  more  missing  time  can  be  explained  by  the  fact  that
753       prim_file:open_int/4  both  calls  prim_file:drv_command/2  directly as
754       well as through  prim_file:open_int_setopts/3,  which  complicates  the
755       picture.
756
757       {[{{prim_file,open,2},                    1,   20.309,    0.029},
758         {{prim_file,open_int,4},                1,    0.000,    0.057}],
759        { {prim_file,open_int,4},                2,   20.309,    0.086},     %
760        [{{prim_file,drv_command,2},             1,   19.755,    0.017},
761         {{prim_file,open_int_setopts,3},        1,    0.360,    0.032},
762         {{prim_file,drv_open,2},                1,    0.071,    0.030},
763         {{erlang,list_to_binary,1},             1,    0.020,    0.020},
764         {{prim_file,i32,1},                     1,    0.017,    0.017},
765         {{prim_file,open_int,4},                1,    0.000,    0.057}]}.
766       {[{{prim_file,open_int,4},                1,    0.360,    0.032},
767         {{prim_file,open_int_setopts,3},        1,    0.000,    0.016}],
768        { {prim_file,open_int_setopts,3},        2,    0.360,    0.048},     %
769        [{suspend,                               1,    0.165,    0.000},
770         {{prim_file,drv_command,2},             1,    0.147,    0.016},
771         {{prim_file,open_int_setopts,3},        1,    0.000,    0.016}]}.
772

NOTES

774       The  actual supervision of execution times is in itself a CPU intensive
775       activity. A message is written on the trace  file  for  every  function
776       call that is made by the profiled code.
777
778       The  ACC time calculation is sometimes difficult to make correct, since
779       it is difficult to define. This  happens  especially  when  a  function
780       occurs  in  several instances in the call stack, for example by calling
781       itself perhaps through other functions and perhaps even non-tail recur‐
782       sively.
783
784       To  produce  sensible  results,  fprof tries not to charge any function
785       more than once for ACC time. The  instance  highest  up  (with  longest
786       duration) in the call stack is chosen.
787
788       Sometimes  a  function may unexpectedly waste a lot (some 10 ms or more
789       depending on host machine OS) of OWN (and  ACC)  time,  even  functions
790       that  does  practically  nothing at all. The problem may be that the OS
791       has chosen to schedule out the Erlang  runtime  system  process  for  a
792       while, and if the OS does not support high resolution cpu time measure‐
793       ments fprof will use wallclock time for its calculations, and  it  will
794       appear as functions randomly burn virtual machine time.
795

SEE ALSO

797       dbg(3), eprof(3), erlang(3), io(3), Tools User's Guide
798
799
800
801Ericsson AB                     tools 2.11.2.1                        fprof(3)
Impressum