1fprof(3)                   Erlang Module Definition                   fprof(3)
2
3
4

NAME

6       fprof  -  A Time Profiling Tool using trace to file for minimal runtime
7       performance impact.
8

DESCRIPTION

10       This module is used to profile a program to find out how the  execution
11       time  is  used.  Trace  to file is used to minimize runtime performance
12       impact.
13
14       The fprof module uses tracing to collect profiling data, hence there is
15       no  need  for special compilation of any module to be profiled. When it
16       starts tracing, fprof will erase all previous tracing in the  node  and
17       set the necessary trace flags on the profiling target processes as well
18       as local call trace on all functions in all loaded modules and all mod‐
19       ules  to  be loaded. fprof erases all tracing in the node when it stops
20       tracing.
21
22       fprof presents both own time i.e how much time a function has used  for
23       its own execution, and accumulated time i.e including called functions.
24       All presented times are collected using trace timestamps.  fprof  tries
25       to  collect  cpu  time  timestamps, if the host machine OS supports it.
26       Therefore the times may be wallclock times and OS scheduling will  ran‐
27       domly strike all called functions in a presumably fair way.
28
29       If,  however, the profiling time is short, and the host machine OS does
30       not support high resolution cpu time measurements, some few OS schedul‐
31       ings  may  show  up  as ridiculously long execution times for functions
32       doing practically nothing. An example of a function more or  less  just
33       composing a tuple in about 100 times the normal execution time has been
34       seen, and when the tracing was repeated, the execution time became nor‐
35       mal.
36
37       Profiling is essentially done in 3 steps:
38
39         1:
40           Tracing; to file, as mentioned in the previous paragraph. The trace
41           contains entries for function calls, returns to  function,  process
42           scheduling,  other process related (spawn, etc) events, and garbage
43           collection. All trace entries are timestamped.
44
45         2:
46           Profiling; the trace file is read, the execution call stack is sim‐
47           ulated,  and raw profile data is calculated from the simulated call
48           stack and the trace timestamps. The profile data is stored  in  the
49           fprof  server  state. During this step the trace data may be dumped
50           in text format to file or console.
51
52         3:
53           Analysing; the raw profile data is sorted, filtered and  dumped  in
54           text  format either to file or console. The text format intended to
55           be both readable for a human reader, as well as parsable  with  the
56           standard erlang parsing tools.
57
58       Since  fprof uses trace to file, the runtime performance degradation is
59       minimized, but still far from negligible, especially for programs  that
60       use  the  filesystem  heavily  by themselves. Where you place the trace
61       file is also important, e.g on Solaris /tmp is usually  a  good  choice
62       since  it  is  essentially  a RAM disk, while any NFS (network) mounted
63       disk is a bad idea.
64
65       fprof can also skip the file step and trace to a  tracer  process  that
66       does the profiling in runtime.
67

EXPORTS

69       start() -> {ok, Pid} | {error, {already_started, Pid}}
70
71              Types:
72
73                 Pid = pid()
74
75              Starts the fprof server.
76
77              Note  that  it seldom needs to be started explicitly since it is
78              automatically started by  the  functions  that  need  a  running
79              server.
80
81       stop() -> ok
82
83              Same as stop(normal).
84
85       stop(Reason) -> ok
86
87              Types:
88
89                 Reason = term()
90
91              Stops the fprof server.
92
93              The  supplied  Reason  becomes  the  exit  reason for the server
94              process. Default Any Reason other than kill sends a  request  to
95              the server and waits for it to clean up, reply and exit. If Rea‐
96              son is kill, the server is bluntly killed.
97
98              If the fprof server is not running, this function returns  imme‐
99              diately with the same return value.
100
101          Note:
102              When  the fprof server is stopped the collected raw profile data
103              is lost.
104
105
106       apply(Func, Args) -> term()
107
108              Types:
109
110                 Func = function() | {Module, Function}
111                 Args = [term()]
112                 Module = atom()
113                 Function = atom()
114
115              Same as apply(Func, Args, []).
116
117       apply(Module, Function, Args) -> term()
118
119              Types:
120
121                 Args = [term()]
122                 Module = atom()
123                 Function = atom()
124
125              Same as apply({Module, Function}, Args, []).
126
127       apply(Func, Args, OptionList) -> term()
128
129              Types:
130
131                 Func = function() | {Module, Function}
132                 Args = [term()]
133                 OptionList = [Option]
134                 Module = atom()
135                 Function = atom()
136                 Option = continue | start | {procs, PidList} |  TraceStartOp‐
137                 tion
138
139              Calls erlang:apply(Func, Args) surrounded by trace([start, ...])
140              and trace(stop).
141
142              Some effort is made to keep the  trace  clean  from  unnecessary
143              trace  messages;  tracing  is started and stopped from a spawned
144              process while the erlang:apply/2 call is  made  in  the  current
145              process,  only surrounded by receive and send statements towards
146              the trace starting process. The  trace  starting  process  exits
147              when not needed any more.
148
149              The  TraceStartOption  is  any  option  allowed for trace/1. The
150              options [start, {procs, [self() | PidList]} | OptList] are given
151              to trace/1, where OptList is OptionList with continue, start and
152              {procs, _} options removed.
153
154              The continue option inhibits the call to trace(stop) and  leaves
155              it up to the caller to stop tracing at a suitable time.
156
157       apply(Module, Function, Args, OptionList) -> term()
158
159              Types:
160
161                 Module = atom()
162                 Function = atom()
163                 Args = [term()]
164
165              Same as apply({Module, Function}, Args, OptionList).
166
167              OptionList is an option list allowed for apply/3.
168
169       trace(start,  Filename)  ->  ok | {error, Reason} | {'EXIT', ServerPid,
170       Reason}
171
172              Types:
173
174                 Reason = term()
175
176              Same as trace([start, {file, Filename}]).
177
178       trace(verbose, Filename) -> ok | {error, Reason} | {'EXIT',  ServerPid,
179       Reason}
180
181              Types:
182
183                 Reason = term()
184
185              Same as trace([start, verbose, {file, Filename}]).
186
187       trace(OptionName,  OptionValue)  ->  ok  |  {error,  Reason} | {'EXIT',
188       ServerPid, Reason}
189
190              Types:
191
192                 OptionName = atom()
193                 OptionValue = term()
194                 Reason = term()
195
196              Same as trace([{OptionName, OptionValue}]).
197
198       trace(verbose) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
199
200              Types:
201
202                 Reason = term()
203
204              Same as trace([start, verbose]).
205
206       trace(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
207
208              Types:
209
210                 OptionName = atom()
211                 Reason = term()
212
213              Same as trace([OptionName]).
214
215       trace({OptionName, OptionValue}) -> ok |  {error,  Reason}  |  {'EXIT',
216       ServerPid, Reason}
217
218              Types:
219
220                 OptionName = atom()
221                 OptionValue = term()
222                 Reason = term()
223
224              Same as trace([{OptionName, OptionValue}]).
225
226       trace([Option]) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
227
228              Types:
229
230                 Option = start | stop | {procs, PidSpec} | {procs, [PidSpec]}
231                 | verbose | {verbose, bool()} | file  |  {file,  Filename}  |
232                 {tracer, Tracer}
233                 PidSpec = pid() | atom()
234                 Tracer = pid() | port()
235                 Reason = term()
236
237              Starts or stops tracing.
238
239              PidSpec  and  Tracer  are used in calls to erlang:trace(PidSpec,
240              true, [{tracer, Tracer} | Flags]), and Filename is used to  call
241              dbg:trace_port(file, Filename). Please see the appropriate docu‐
242              mentation.
243
244              Option description:
245
246                stop:
247                  Stops a running fprof trace and clears all tracing from  the
248                  node. Either option stop or start must be specified, but not
249                  both.
250
251                start:
252                  Clears all tracing from the node  and  starts  a  new  fprof
253                  trace.  Either  option  start or stop must be specified, but
254                  not both.
255
256                verbose| {verbose, bool()}:
257                  The options verbose or {verbose, true} adds some trace flags
258                  that  fprof  does  not need, but that may be interesting for
259                  general debugging purposes. This option is only allowed with
260                  the start option.
261
262                cpu_time| {cpu_time, bool()}:
263                  The  options  cpu_time  or  {cpu_time, true} makes the time‐
264                  stamps in the trace be in CPU time instead of wallclock time
265                  which  is  the default. This option is only allowed with the
266                  start option.
267
268            Warning:
269                Getting correct values out of cpu_time can be  difficult.  The
270                best way to get correct values is to run using a single sched‐
271                uler and bind that scheduler to a specific CPU, i.e. erl +S  1
272                +sbt db.
273
274
275                {procs, PidSpec}| {procs, [PidSpec]}:
276                  Specifies  which  processes  that  shall  be traced. If this
277                  option is not given, the calling process is traced. All pro‐
278                  cesses spawned by the traced processes are also traced. This
279                  option is only allowed with the start option.
280
281                file| {file, Filename}:
282                  Specifies the filename of the trace. If the option  file  is
283                  given,  or  none  of  these  options  are  given,  the  file
284                  "fprof.trace" is used. This option is only allowed with  the
285                  start option, but not with the {tracer, Tracer} option.
286
287                {tracer, Tracer}:
288                  Specifies  that  trace  to  process  or  port  shall be done
289                  instead of trace to file. This option is only  allowed  with
290                  the start option, but not with the {file, Filename} option.
291
292       profile() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
293
294              Types:
295
296                 Reason = term()
297
298              Same as profile([]).
299
300       profile(OptionName,  OptionValue)  ->  ok  | {error, Reason} | {'EXIT',
301       ServerPid, Reason}
302
303              Types:
304
305                 OptionName = atom()
306                 OptionValue = term()
307                 Reason = term()
308
309              Same as profile([{OptionName, OptionValue}]).
310
311       profile(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid,  Rea‐
312       son}
313
314              Types:
315
316                 OptionName = atom()
317                 Reason = term()
318
319              Same as profile([OptionName]).
320
321       profile({OptionName,  OptionValue})  -> ok | {error, Reason} | {'EXIT',
322       ServerPid, Reason}
323
324              Types:
325
326                 OptionName = atom()
327                 OptionValue = term()
328                 Reason = term()
329
330              Same as profile([{OptionName, OptionValue}]).
331
332       profile([Option]) -> ok | {ok, Tracer} |  {error,  Reason}  |  {'EXIT',
333       ServerPid, Reason}
334
335              Types:
336
337                 Option  =  file  |  {file,  Filename} | dump | {dump, Dump} |
338                 append | start | stop
339                 Dump = pid() | Dumpfile | []
340                 Tracer = pid()
341                 Reason = term()
342
343              Compiles a trace into raw profile data held by the fprof server.
344
345              Dumpfile is used to call file:open/2, and Filename  is  used  to
346              call  dbg:trace_port(file, Filename). Please see the appropriate
347              documentation.
348
349              Option description:
350
351                file| {file, Filename}:
352                  Reads the file Filename and creates raw profile data that is
353                  stored  in  RAM  by  the fprof server. If the option file is
354                  given,  or  none  of  these  options  are  given,  the  file
355                  "fprof.trace"  is  read. The call will return when the whole
356                  trace has been read with the return value ok if  successful.
357                  This option is not allowed with the start or stop options.
358
359                dump| {dump, Dump}:
360                  Specifies  the  destination for the trace text dump. If this
361                  option is not given, no dump is generated, if it is dump the
362                  destination will be the caller's group leader, otherwise the
363                  destination Dump is either the pid of an  I/O  device  or  a
364                  filename. And, finally, if the filename is [] - "fprof.dump"
365                  is used instead. This option is not allowed  with  the  stop
366                  option.
367
368                append:
369                  Causes the trace text dump to be appended to the destination
370                  file. This option is only allowed with the {dump,  Dumpfile}
371                  option.
372
373                start:
374                  Starts a tracer process that profiles trace data in runtime.
375                  The call will return immediately with the return value  {ok,
376                  Tracer}  if  successful. This option is not allowed with the
377                  stop, file or {file, Filename} options.
378
379                stop:
380                  Stops the tracer process that profiles trace  data  in  run‐
381                  time.  The return value will be value ok if successful. This
382                  option is not allowed with the start, file or  {file,  File‐
383                  name} options.
384
385       analyse() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
386
387              Types:
388
389                 Reason = term()
390
391              Same as analyse([]).
392
393       analyse(OptionName,  OptionValue)  ->  ok  | {error, Reason} | {'EXIT',
394       ServerPid, Reason}
395
396              Types:
397
398                 OptionName = atom()
399                 OptionValue = term()
400                 Reason = term()
401
402              Same as analyse([{OptionName, OptionValue}]).
403
404       analyse(OptionName) -> ok | {error, Reason} | {'EXIT', ServerPid,  Rea‐
405       son}
406
407              Types:
408
409                 OptionName = atom()
410                 Reason = term()
411
412              Same as analyse([OptionName]).
413
414       analyse({OptionName,  OptionValue})  -> ok | {error, Reason} | {'EXIT',
415       ServerPid, Reason}
416
417              Types:
418
419                 OptionName = atom()
420                 OptionValue = term()
421                 Reason = term()
422
423              Same as analyse([{OptionName, OptionValue}]).
424
425       analyse([Option]) -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
426
427              Types:
428
429                 Option = dest | {dest, Dest} | append | {cols, Cols} |  call‐
430                 ers  |  {callers,  bool()}  | no_callers | {sort, SortSpec} |
431                 totals | {totals, bool()} | details  |  {details,  bool()}  |
432                 no_details
433                 Dest = pid() | Destfile
434                 Cols = integer() >= 80
435                 SortSpec = acc | own
436                 Reason = term()
437
438              Analyses  raw  profile data in the fprof server. If called while
439              there is no raw profile data available, {error,  no_profile}  is
440              returned.
441
442              Destfile is used to call file:open/2. Please see the appropriate
443              documentation.
444
445              Option description:
446
447                dest| {dest, Dest}:
448                  Specifies the destination for the analysis. If  this  option
449                  is  not  given  or  it  is dest, the destination will be the
450                  caller's group leader, otherwise  the  destination  Dest  is
451                  either  the  pid()  of  an  I/O  device  or a filename. And,
452                  finally, if the filename is [] -  "fprof.analysis"  is  used
453                  instead.
454
455                append:
456                  Causes  the analysis to be appended to the destination file.
457                  This option  is  only  allowed  with  the  {dest,  Destfile}
458                  option.
459
460                {cols, Cols}:
461                  Specifies  the  number  of  columns in the analysis text. If
462                  this option is not given the number of columns is set to 80.
463
464                callers| {callers, true}:
465                  Prints callers and called information in the analysis.  This
466                  is the default.
467
468                {callers, false}| no_callers:
469                  Suppresses the printing of callers and called information in
470                  the analysis.
471
472                {sort, SortSpec}:
473                  Specifies if the analysis should be sorted according to  the
474                  ACC  column,  which  is  the default, or the OWN column. See
475                  Analysis Format below.
476
477                totals| {totals, true}:
478                  Includes a section containing call statistics for all  calls
479                  regardless of process, in the analysis.
480
481                {totals, false}:
482                  Supresses  the  totals section in the analysis, which is the
483                  default.
484
485                details| {details, true}:
486                  Prints call statistics for each  process  in  the  analysis.
487                  This is the default.
488
489                {details, false}| no_details:
490                  Suppresses  the  call  statistics  for each process from the
491                  analysis.
492

ANALYSIS FORMAT

494       This section describes the output format of the  analyse  command.  See
495       analyse/0.
496
497       The  format is parsable with the standard Erlang parsing tools erl_scan
498       and erl_parse, file:consult/1 or io:read/2. The  parse  format  is  not
499       explained  here  -  it should be easy for the interested to try it out.
500       Note that some flags to analyse/1 will affect the format.
501
502       The following example was run on OTP/R8 on Solaris 8, all OTP internals
503       in this example are very version dependent.
504
505       As  an example, we will use the following function, that you may recog‐
506       nise as  a  slightly  modified  benchmark  function  from  the  manpage
507       file(3):
508
509       -module(foo).
510       -export([create_file_slow/2]).
511
512       create_file_slow(Name, N) when integer(N), N >= 0 ->
513           {ok, FD} =
514               file:open(Name, [raw, write, delayed_write, binary]),
515           if N > 256 ->
516                   ok = file:write(FD,
517                                   lists:map(fun (X) -> <<X:32/unsigned>> end,
518                                   lists:seq(0, 255))),
519                   ok = create_file_slow(FD, 256, N);
520              true ->
521                   ok = create_file_slow(FD, 0, N)
522           end,
523           ok = file:close(FD).
524
525       create_file_slow(FD, M, M) ->
526           ok;
527       create_file_slow(FD, M, N) ->
528           ok = file:write(FD, <<M:32/unsigned>>),
529           create_file_slow(FD, M+1, N).
530
531       Let us have a look at the printout after running:
532
533       1> fprof:apply(foo, create_file_slow, [junk, 1024]).
534       2> fprof:profile().
535       3> fprof:analyse().
536
537       The printout starts with:
538
539       %% Analysis results:
540       {  analysis_options,
541        [{callers, true},
542         {sort, acc},
543         {totals, false},
544         {details, true}]}.
545
546       %                                       CNT       ACC       OWN
547       [{ totals,                             9627, 1691.119, 1659.074}].  %%%
548
549       The  CNT column shows the total number of function calls that was found
550       in the trace. In the ACC column is the total time  of  the  trace  from
551       first timestamp to last. And in the OWN column is the sum of the execu‐
552       tion time in functions found in the trace, not including  called  func‐
553       tions. In this case it is very close to the ACC time since the emulator
554       had practically nothing else to do than to execute our test program.
555
556       All time values in the printout are in milliseconds.
557
558       The printout continues:
559
560       %                                       CNT       ACC       OWN
561       [{ "<0.28.0>",                         9627,undefined, 1659.074}].   %%
562
563       This is the printout header of one process. The printout contains  only
564       this  one process since we did fprof:apply/3 which traces only the cur‐
565       rent process. Therefore the CNT and OWN columns perfectly  matches  the
566       totals  above.  The ACC column is undefined since summing the ACC times
567       of all calls in the process makes no sense - you  would  get  something
568       like the ACC value from totals above multiplied by the average depth of
569       the call stack, or something.
570
571       All paragraphs up to the next process  header  only  concerns  function
572       calls within this process.
573
574       Now we come to something more interesting:
575
576       {[{undefined,                             0, 1691.076,    0.030}],
577        { {fprof,apply_start_stop,4},            0, 1691.076,    0.030},     %
578        [{{foo,create_file_slow,2},              1, 1691.046,    0.103},
579         {suspend,                               1,    0.000,    0.000}]}.
580
581       {[{{fprof,apply_start_stop,4},            1, 1691.046,    0.103}],
582        { {foo,create_file_slow,2},              1, 1691.046,    0.103},     %
583        [{{file,close,1},                        1, 1398.873,    0.019},
584         {{foo,create_file_slow,3},              1,  249.678,    0.029},
585         {{file,open,2},                         1,   20.778,    0.055},
586         {{lists,map,2},                         1,   16.590,    0.043},
587         {{lists,seq,2},                         1,    4.708,    0.017},
588         {{file,write,2},                        1,    0.316,    0.021}]}.
589
590       The  printout  consists of one paragraph per called function. The func‐
591       tion marked with '%' is the  one  the  paragraph  concerns  -  foo:cre‐
592       ate_file_slow/2.  Above the marked function are the calling functions -
593       those that has called the marked, and below are  those  called  by  the
594       marked function.
595
596       The  paragraphs  are  per default sorted in decreasing order of the ACC
597       column for the marked function. The calling list and called list within
598       one  paragraph are also per default sorted in decreasing order of their
599       ACC column.
600
601       The columns are: CNT - the  number  of  times  the  function  has  been
602       called,  ACC  -  the  time spent in the function including called func‐
603       tions, and OWN - the time spent in the function  not  including  called
604       functions.
605
606       The  rows  for  the calling functions contain statistics for the marked
607       function with the constraint that only the occasions when  a  call  was
608       made from the row's function to the marked function are accounted for.
609
610       The  row for the marked function simply contains the sum of all calling
611       rows.
612
613       The rows for the called functions contains  statistics  for  the  row's
614       function  with  the  constraint that only the occasions when a call was
615       made from the marked to the row's function are accounted for.
616
617       So, we see that foo:create_file_slow/2 used very little  time  for  its
618       own  execution. It spent most of its time in file:close/1. The function
619       foo:create_file_slow/3 that writes 3/4 of the file contents is the sec‐
620       ond biggest time thief.
621
622       We  also  see that the call to file:write/2 that writes 1/4 of the file
623       contents takes very little time in itself. What takes time is to  build
624       the data (lists:seq/2 and lists:map/2).
625
626       The function 'undefined' that has called fprof:apply_start_stop/4 is an
627       unknown function because that call was not recorded in  the  trace.  It
628       was    only    recorded    that    the    execution    returned    from
629       fprof:apply_start_stop/4 to some  other  function  above  in  the  call
630       stack, or that the process exited from there.
631
632       Let us continue down the printout to find:
633
634       {[{{foo,create_file_slow,2},              1,  249.678,    0.029},
635         {{foo,create_file_slow,3},            768,    0.000,   23.294}],
636        { {foo,create_file_slow,3},            769,  249.678,   23.323},     %
637        [{{file,write,2},                      768,  220.314,   14.539},
638         {suspend,                              57,    6.041,    0.000},
639         {{foo,create_file_slow,3},            768,    0.000,   23.294}]}.
640
641       If  you  compare  with  the  code you will see there also that foo:cre‐
642       ate_file_slow/3 was called only from foo:create_file_slow/2 and itself,
643       and called only file:write/2, note the number of calls to file:write/2.
644       But here we see that suspend was called a few times. This is  a  pseudo
645       function  that indicates that the process was suspended while executing
646       in  foo:create_file_slow/3,  and  since  there   is   no   receive   or
647       erlang:yield/0  in  the code, it must be Erlang scheduling suspensions,
648       or the trace file driver compensating for large file  write  operations
649       (these  are regarded as a schedule out followed by a schedule in to the
650       same process).
651
652       Let us find the suspend entry:
653
654       {[{{file,write,2},                       53,    6.281,    0.000},
655         {{foo,create_file_slow,3},             57,    6.041,    0.000},
656         {{prim_file,drv_command,4},            50,    4.582,    0.000},
657         {{prim_file,drv_get_response,1},       34,    2.986,    0.000},
658         {{lists,map,2},                        10,    2.104,    0.000},
659         {{prim_file,write,2},                  17,    1.852,    0.000},
660         {{erlang,port_command,2},              15,    1.713,    0.000},
661         {{prim_file,drv_command,2},            22,    1.482,    0.000},
662         {{prim_file,translate_response,2},     11,    1.441,    0.000},
663         {{prim_file,'-drv_command/2-fun-0-',1},  15,    1.340,    0.000},
664         {{lists,seq,4},                         3,    0.880,    0.000},
665         {{foo,'-create_file_slow/2-fun-0-',1},   5,    0.523,    0.000},
666         {{erlang,bump_reductions,1},            4,    0.503,    0.000},
667         {{prim_file,open_int_setopts,3},        1,    0.165,    0.000},
668         {{prim_file,i32,4},                     1,    0.109,    0.000},
669         {{fprof,apply_start_stop,4},            1,    0.000,    0.000}],
670        { suspend,                             299,   32.002,    0.000},     %
671        [ ]}.
672
673       We find no particulary long suspend times, so no function seems to have
674       waited  in  a receive statement. Actually, prim_file:drv_command/4 con‐
675       tains a receive statement, but in this test program, the  message  lies
676       in the process receive buffer when the receive statement is entered. We
677       also see that the total suspend time for the test run is small.
678
679       The suspend pseudo function has got an OWN time of  zero.  This  is  to
680       prevent  the  process total OWN time from including time in suspension.
681       Whether suspend time is really ACC or OWN time is more of a philosophi‐
682       cal question.
683
684       Now we look at another interesting pseudo function, garbage_collect:
685
686       {[{{prim_file,drv_command,4},            25,    0.873,    0.873},
687         {{prim_file,write,2},                  16,    0.692,    0.692},
688         {{lists,map,2},                         2,    0.195,    0.195}],
689        { garbage_collect,                      43,    1.760,    1.760},     %
690        [ ]}.
691
692       Here  we  see that no function distinguishes itself considerably, which
693       is very normal.
694
695       The garbage_collect pseudo function has not got an  OWN  time  of  zero
696       like suspend, instead it is equal to the ACC time.
697
698       Garbage  collect  often  occurs while a process is suspended, but fprof
699       hides this fact by pretending that the  suspended  function  was  first
700       unsuspended  and  then  garbage collected. Otherwise the printout would
701       show garbage_collect being called from suspend but not  which  function
702       that might have caused the garbage collection.
703
704       Let us now get back to the test code:
705
706       {[{{foo,create_file_slow,3},            768,  220.314,   14.539},
707         {{foo,create_file_slow,2},              1,    0.316,    0.021}],
708        { {file,write,2},                      769,  220.630,   14.560},     %
709        [{{prim_file,write,2},                 769,  199.789,   22.573},
710         {suspend,                              53,    6.281,    0.000}]}.
711
712       Not  unexpectedly,  we  see  that file:write/2 was called from foo:cre‐
713       ate_file_slow/3 and foo:create_file_slow/2. The number of calls in each
714       case  as  well  as  the  used  time are also just confirms the previous
715       results.
716
717       We see that file:write/2  only  calls  prim_file:write/2,  but  let  us
718       refrain from digging into the internals of the kernel application.
719
720       But,  if  we nevertheless do dig down we find the call to the linked in
721       driver that does the file operations towards the host operating system:
722
723       {[{{prim_file,drv_command,4},           772, 1458.356, 1456.643}],
724        { {erlang,port_command,2},             772, 1458.356, 1456.643},     %
725        [{suspend,                              15,    1.713,    0.000}]}.
726
727       This is 86 % of the total run time, and as we  saw  before  it  is  the
728       close  operation the absolutely biggest contributor. We find a compari‐
729       son ratio a little bit up in the call stack:
730
731       {[{{prim_file,close,1},                   1, 1398.748,    0.024},
732         {{prim_file,write,2},                 769,  174.672,   12.810},
733         {{prim_file,open_int,4},                1,   19.755,    0.017},
734         {{prim_file,open_int_setopts,3},        1,    0.147,    0.016}],
735        { {prim_file,drv_command,2},           772, 1593.322,   12.867},     %
736        [{{prim_file,drv_command,4},           772, 1578.973,   27.265},
737         {suspend,                              22,    1.482,    0.000}]}.
738
739       The time for file operations in the linked in driver distributes itself
740       as  1 % for open, 11 % for write and 87 % for close. All data is proba‐
741       bly buffered in the operating system until the close.
742
743       The  unsleeping  reader   may   notice   that   the   ACC   times   for
744       prim_file:drv_command/2   and   prim_file:drv_command/4  is  not  equal
745       between the paragraphs above, even though it is easy  to  believe  that
746       prim_file:drv_command/2 is just a passthrough function.
747
748       The  missing  time can be found in the paragraph for prim_file:drv_com‐
749       mand/4 where it is evident that  not  only  prim_file:drv_command/2  is
750       called but also a fun:
751
752       {[{{prim_file,drv_command,2},           772, 1578.973,   27.265}],
753        { {prim_file,drv_command,4},           772, 1578.973,   27.265},     %
754        [{{erlang,port_command,2},             772, 1458.356, 1456.643},
755         {{prim_file,'-drv_command/2-fun-0-',1}, 772,   87.897,   12.736},
756         {suspend,                              50,    4.582,    0.000},
757         {garbage_collect,                      25,    0.873,    0.873}]}.
758
759       And  some  more  missing  time  can  be  explained  by  the  fact  that
760       prim_file:open_int/4 both  calls  prim_file:drv_command/2  directly  as
761       well  as  through  prim_file:open_int_setopts/3,  which complicates the
762       picture.
763
764       {[{{prim_file,open,2},                    1,   20.309,    0.029},
765         {{prim_file,open_int,4},                1,    0.000,    0.057}],
766        { {prim_file,open_int,4},                2,   20.309,    0.086},     %
767        [{{prim_file,drv_command,2},             1,   19.755,    0.017},
768         {{prim_file,open_int_setopts,3},        1,    0.360,    0.032},
769         {{prim_file,drv_open,2},                1,    0.071,    0.030},
770         {{erlang,list_to_binary,1},             1,    0.020,    0.020},
771         {{prim_file,i32,1},                     1,    0.017,    0.017},
772         {{prim_file,open_int,4},                1,    0.000,    0.057}]}.
773       {[{{prim_file,open_int,4},                1,    0.360,    0.032},
774         {{prim_file,open_int_setopts,3},        1,    0.000,    0.016}],
775        { {prim_file,open_int_setopts,3},        2,    0.360,    0.048},     %
776        [{suspend,                               1,    0.165,    0.000},
777         {{prim_file,drv_command,2},             1,    0.147,    0.016},
778         {{prim_file,open_int_setopts,3},        1,    0.000,    0.016}]}.
779

NOTES

781       The actual supervision of execution times is in itself a CPU  intensive
782       activity.  A  message  is  written on the trace file for every function
783       call that is made by the profiled code.
784
785       The ACC time calculation is sometimes difficult to make correct,  since
786       it  is  difficult  to  define.  This happens especially when a function
787       occurs in several instances in the call stack, for example  by  calling
788       itself perhaps through other functions and perhaps even non-tail recur‐
789       sively.
790
791       To produce sensible results, fprof tries not  to  charge  any  function
792       more  than  once  for  ACC  time. The instance highest up (with longest
793       duration) in the call stack is chosen.
794
795       Sometimes a function may unexpectedly waste a lot (some 10 ms  or  more
796       depending  on  host  machine  OS) of OWN (and ACC) time, even functions
797       that does practically nothing at all. The problem may be  that  the  OS
798       has  chosen  to  schedule  out  the Erlang runtime system process for a
799       while, and if the OS does not support high resolution cpu time measure‐
800       ments  fprof  will use wallclock time for its calculations, and it will
801       appear as functions randomly burn virtual machine time.
802

SEE ALSO

804       dbg(3), eprof(3), erlang(3), io(3), Tools User's Guide
805
806
807
808Ericsson AB                      tools 3.1.0.1                        fprof(3)
Impressum