1fprof(3)                   Erlang Module Definition                   fprof(3)
2
3
4

NAME

6       fprof  -  A Time Profiling Tool using trace to file for minimal runtime
7       performance impact.
8

DESCRIPTION

10       This module is used to profile a program to find out how the  execution
11       time is used. Trace to file is used to minimize runtime performance im‐
12       pact.
13
14       The fprof module uses tracing to collect profiling data, hence there is
15       no  need  for special compilation of any module to be profiled. When it
16       starts tracing, fprof will erase all previous tracing in the  node  and
17       set the necessary trace flags on the profiling target processes as well
18       as local call trace on all functions in all loaded modules and all mod‐
19       ules  to  be loaded. fprof erases all tracing in the node when it stops
20       tracing.
21
22       fprof presents both own time i.e how much time a function has used  for
23       its own execution, and accumulated time i.e including called functions.
24       All presented times are collected using trace timestamps.  fprof  tries
25       to  collect  cpu  time  timestamps, if the host machine OS supports it.
26       Therefore the times may be wallclock times and OS scheduling will  ran‐
27       domly strike all called functions in a presumably fair way.
28
29       If,  however, the profiling time is short, and the host machine OS does
30       not support high resolution cpu time measurements, some few OS schedul‐
31       ings may show up as ridiculously long execution times for functions do‐
32       ing practically nothing. An example of a function  more  or  less  just
33       composing a tuple in about 100 times the normal execution time has been
34       seen, and when the tracing was repeated, the execution time became nor‐
35       mal.
36
37       Profiling is essentially done in 3 steps:
38
39         1:
40           Tracing; to file, as mentioned in the previous paragraph. The trace
41           contains entries for function calls, returns to  function,  process
42           scheduling,  other process related (spawn, etc) events, and garbage
43           collection. All trace entries are timestamped.
44
45         2:
46           Profiling; the trace file is read, the execution call stack is sim‐
47           ulated,  and raw profile data is calculated from the simulated call
48           stack and the trace timestamps. The profile data is stored  in  the
49           fprof  server  state. During this step the trace data may be dumped
50           in text format to file or console.
51
52         3:
53           Analysing; the raw profile data is sorted, filtered and  dumped  in
54           text  format either to file or console. The text format intended to
55           be both readable for a human reader, as well as parsable  with  the
56           standard erlang parsing tools.
57
58       Since  fprof uses trace to file, the runtime performance degradation is
59       minimized, but still far from negligible, especially for programs  that
60       use  the  filesystem  heavily  by themselves. Where you place the trace
61       file is also important, e.g on Solaris /tmp is usually  a  good  choice
62       since  it  is  essentially  a RAM disk, while any NFS (network) mounted
63       disk is a bad idea.
64
65       fprof can also skip the file step and trace to a  tracer  process  that
66       does the profiling in runtime.
67

EXPORTS

69       start() -> {ok, Pid} | {error, {already_started, Pid}}
70
71              Types:
72
73                 Pid = pid()
74
75              Starts the fprof server.
76
77              Note  that  it seldom needs to be started explicitly since it is
78              automatically started by  the  functions  that  need  a  running
79              server.
80
81       stop() -> ok
82
83              Same as stop(normal).
84
85       stop(Reason) -> ok
86
87              Types:
88
89                 Reason = term()
90
91              Stops the fprof server.
92
93              The  supplied  Reason  becomes  the  exit  reason for the server
94              process. Default Any Reason other than kill sends a  request  to
95              the server and waits for it to clean up, reply and exit. If Rea‐
96              son is kill, the server is bluntly killed.
97
98              If the fprof server is not running, this function returns  imme‐
99              diately with the same return value.
100
101          Note:
102              When  the fprof server is stopped the collected raw profile data
103              is lost.
104
105
106       apply(Func, Args) -> term()
107
108              Types:
109
110                 Func = function() | {Module :: module(), Function :: atom()}
111                 Args = [term()]
112
113              Same as apply(Func, Args, []).
114
115       apply(Module, Function, Args) -> term()
116
117              Types:
118
119                 Module = module()
120                 Function = atom()
121                 Args = [term()]
122
123              Same as apply({Module, Function}, Args, []).
124
125       apply(Func, Args, OptionList) -> term()
126
127              Types:
128
129                 Func = function() | {Module :: module(), Function :: atom()}
130                 Args = [term()]
131                 OptionList = [Option]
132                 Option = apply_option()
133                 apply_option() =
134                     continue |
135                     {procs, PidList :: [pid()]} |
136                     start |
137                     (TraceStartOption :: trace_option())
138                 trace_option() =
139                     cpu_time |
140                     {cpu_time, boolean()} |
141                     file |
142                     {file, Filename :: file:filename()} |
143                     {procs, PidSpec :: pid_spec()} |
144                     {procs, [PidSpec :: pid_spec()]} |
145                     start | stop |
146                     {tracer, Tracer :: pid() | port()} |
147                     verbose |
148                     {verbose, boolean()}
149                 pid_spec() = pid() | atom()
150
151              Calls erlang:apply(Func, Args) surrounded by trace([start, ...])
152              and trace(stop).
153
154              Some  effort  is  made  to keep the trace clean from unnecessary
155              trace messages; tracing is started and stopped  from  a  spawned
156              process  while  the  erlang:apply/2  call is made in the current
157              process, only surrounded by receive and send statements  towards
158              the  trace  starting  process.  The trace starting process exits
159              when not needed any more.
160
161              The TraceStartOption is any option allowed for trace/1. The  op‐
162              tions  [start,  {procs, [self() | PidList]} | OptList] are given
163              to trace/1, where OptList is OptionList with continue, start and
164              {procs, _} options removed.
165
166              The  continue option inhibits the call to trace(stop) and leaves
167              it up to the caller to stop tracing at a suitable time.
168
169       apply(Module, Function, Args, OptionList) -> term()
170
171              Types:
172
173                 Module = module()
174                 Function = atom()
175                 Args = [term()]
176                 OptionList = [Option]
177                 Option = apply_option()
178                 apply_option() =
179                     continue |
180                     {procs, PidList :: [pid()]} |
181                     start |
182                     (TraceStartOption :: trace_option())
183                 trace_option() =
184                     cpu_time |
185                     {cpu_time, boolean()} |
186                     file |
187                     {file, Filename :: file:filename()} |
188                     {procs, PidSpec :: pid_spec()} |
189                     {procs, [PidSpec :: pid_spec()]} |
190                     start | stop |
191                     {tracer, Tracer :: pid() | port()} |
192                     verbose |
193                     {verbose, boolean()}
194                 pid_spec() = pid() | atom()
195
196              Same as apply({Module, Function}, Args, OptionList).
197
198              OptionList is an option list allowed for apply/3.
199
200       trace(OptionName :: start, Filename) ->
201                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
202
203              Types:
204
205                 Filename = file:filename()
206                 ServerPid = pid()
207                 Reason = term()
208
209              Same as trace([start, {file, Filename}]).
210
211       trace(OptionName :: verbose, Filename) ->
212                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
213
214              Types:
215
216                 Filename = file:filename()
217                 ServerPid = pid()
218                 Reason = term()
219
220              Same as trace([start, verbose, {file, Filename}]).
221
222       trace(OptionName, OptionValue) ->
223                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
224
225              Types:
226
227                 OptionName = atom()
228                 OptionValue = term()
229                 ServerPid = pid()
230                 Reason = term()
231
232              Same as trace([{OptionName, OptionValue}]).
233
234       trace(Option :: verbose) ->
235                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
236
237              Types:
238
239                 ServerPid = pid()
240                 Reason = term()
241
242              Same as trace([start, verbose]).
243
244       trace(OptionName) ->
245                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
246
247              Types:
248
249                 OptionName = atom()
250                 ServerPid = pid()
251                 Reason = term()
252
253              Same as trace([OptionName]).
254
255       trace(Option :: {OptionName, OptionValue}) ->
256                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
257
258              Types:
259
260                 OptionName = atom()
261                 OptionValue = term()
262                 ServerPid = pid()
263                 Reason = term()
264
265              Same as trace([{OptionName, OptionValue}]).
266
267       trace(OptionList) ->
268                ok | {error, Reason} | {'EXIT', ServerPid, Reason}
269
270              Types:
271
272                 OptionList = [Option]
273                 Option = trace_option()
274                 ServerPid = pid()
275                 Reason = term()
276                 trace_option() =
277                     cpu_time |
278                     {cpu_time, boolean()} |
279                     file |
280                     {file, Filename :: file:filename()} |
281                     {procs, PidSpec :: pid_spec()} |
282                     {procs, [PidSpec :: pid_spec()]} |
283                     start | stop |
284                     {tracer, Tracer :: pid() | port()} |
285                     verbose |
286                     {verbose, boolean()}
287                 pid_spec() = pid() | atom()
288
289              Starts or stops tracing.
290
291              PidSpec and Tracer are used in  calls  to  erlang:trace(PidSpec,
292              true,  [{tracer, Tracer} | Flags]), and Filename is used to call
293              dbg:trace_port(file, Filename). Please  see  erlang:trace/3  and
294              dbg:trace_port/2.
295
296              Option description:
297
298                stop:
299                  Stops  a running fprof trace and clears all tracing from the
300                  node. Either option stop or start must be specified, but not
301                  both.
302
303                start:
304                  Clears  all  tracing  from  the  node and starts a new fprof
305                  trace. Either option start or stop must  be  specified,  but
306                  not both.
307
308                verbose | {verbose, boolean()}:
309                  The options verbose or {verbose, true} adds some trace flags
310                  that fprof does not need, but that may  be  interesting  for
311                  general debugging purposes. This option is only allowed with
312                  the start option.
313
314                cpu_time | {cpu_time, boolean()}:
315                  The options cpu_time or {cpu_time,  true}  makes  the  time‐
316                  stamps in the trace be in CPU time instead of wallclock time
317                  which is the default. This option is only allowed  with  the
318                  start option.
319
320            Warning:
321                Getting  correct  values out of cpu_time can be difficult. The
322                best way to get correct values is to run using a single sched‐
323                uler  and bind that scheduler to a specific CPU, i.e. erl +S 1
324                +sbt db.
325
326
327                {procs, PidSpec} | {procs, [PidSpec]}:
328                  Specifies which processes that shall be traced. If this  op‐
329                  tion  is  not given, the calling process is traced. All pro‐
330                  cesses spawned by the traced processes are also traced. This
331                  option is only allowed with the start option.
332
333                file | {file, Filename}:
334                  Specifies  the  filename of the trace. If the option file is
335                  given,  or  none  of  these  options  are  given,  the  file
336                  "fprof.trace"  is used. This option is only allowed with the
337                  start option, but not with the {tracer, Tracer} option.
338
339                {tracer, Tracer}:
340                  Specifies that trace to process or port shall  be  done  in‐
341                  stead of trace to file. This option is only allowed with the
342                  start option, but not with the {file, Filename} option.
343
344       profile() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
345
346              Types:
347
348                 ServerPid = pid()
349                 Reason = term()
350
351              Same as profile([]).
352
353       profile(OptionName, OptionValue) ->
354                  ok |
355                  {ok, Tracer} |
356                  {error, Reason} |
357                  {'EXIT', ServerPid, Reason}
358
359              Types:
360
361                 OptionName = atom()
362                 OptionValue = term()
363                 Tracer = ServerPid = pid()
364                 Reason = term()
365
366              Same as profile([{OptionName, OptionValue}]).
367
368       profile(OptionName) ->
369                  ok |
370                  {ok, Tracer} |
371                  {error, Reason} |
372                  {'EXIT', ServerPid, Reason}
373
374              Types:
375
376                 OptionName = atom()
377                 Tracer = ServerPid = pid()
378                 Reason = term()
379
380              Same as profile([OptionName]).
381
382       profile(Option :: {OptionName, OptionValue}) ->
383                  ok |
384                  {ok, Tracer} |
385                  {error, Reason} |
386                  {'EXIT', ServerPid, Reason}
387
388              Types:
389
390                 OptionName = atom()
391                 OptionValue = term()
392                 Tracer = ServerPid = pid()
393                 Reason = term()
394
395              Same as profile([{OptionName, OptionValue}]).
396
397       profile(OptionList) ->
398                  ok |
399                  {ok, Tracer} |
400                  {error, Reason} |
401                  {'EXIT', ServerPid, Reason}
402
403              Types:
404
405                 OptionList = [Option]
406                 Option = profile_option()
407                 Tracer = ServerPid = pid()
408                 Reason = term()
409                 profile_option() =
410                     append | dump |
411                     {dump, pid() | (Dump :: (Dumpfile :: file:filename() | []))} |
412                     file |
413                     {file, Filename :: file:filename()} |
414                     start | stop
415
416              Compiles a trace into raw profile data held by the fprof server.
417
418              Dumpfile is used to call file:open/2, and Filename  is  used  to
419              call  dbg:trace_port(file, Filename). Please see file:open/2 and
420              dbg:trace_port/2.
421
422              Option description:
423
424                file | {file, Filename}:
425                  Reads the file Filename and creates raw profile data that is
426                  stored  in  RAM  by  the fprof server. If the option file is
427                  given,  or  none  of  these  options  are  given,  the  file
428                  "fprof.trace"  is  read. The call will return when the whole
429                  trace has been read with the return value ok if  successful.
430                  This option is not allowed with the start or stop options.
431
432                dump | {dump, Dump}:
433                  Specifies  the  destination for the trace text dump. If this
434                  option is not given, no dump is generated, if it is dump the
435                  destination will be the caller's group leader, otherwise the
436                  destination Dump is either the pid of an  I/O  device  or  a
437                  filename. And, finally, if the filename is [] - "fprof.dump"
438                  is used instead. This option is not allowed  with  the  stop
439                  option.
440
441                append:
442                  Causes the trace text dump to be appended to the destination
443                  file. This option is only allowed with the {dump,  Dumpfile}
444                  option.
445
446                start:
447                  Starts a tracer process that profiles trace data in runtime.
448                  The call will return immediately with the return value  {ok,
449                  Tracer}  if  successful. This option is not allowed with the
450                  stop, file or {file, Filename} options.
451
452                stop:
453                  Stops the tracer process that profiles trace  data  in  run‐
454                  time.  The return value will be value ok if successful. This
455                  option is not allowed with the start, file or  {file,  File‐
456                  name} options.
457
458       analyse() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
459
460              Types:
461
462                 ServerPid = pid()
463                 Reason = term()
464
465              Same as analyse([]).
466
467       analyse(OptionName, OptionValue) ->
468                  ok | {error, Reason} | {'EXIT', ServerPid, Reason}
469
470              Types:
471
472                 OptionName = atom()
473                 OptionValue = term()
474                 ServerPid = pid()
475                 Reason = term()
476
477              Same as analyse([{OptionName, OptionValue}]).
478
479       analyse(OptionName) ->
480                  ok | {error, Reason} | {'EXIT', ServerPid, Reason}
481
482              Types:
483
484                 OptionName = atom()
485                 ServerPid = pid()
486                 Reason = term()
487
488              Same as analyse([OptionName]).
489
490       analyse(Option :: {OptionName, OptionValue}) ->
491                  ok | {error, Reason} | {'EXIT', ServerPid, Reason}
492
493              Types:
494
495                 OptionName = atom()
496                 OptionValue = term()
497                 ServerPid = pid()
498                 Reason = term()
499
500              Same as analyse([{OptionName, OptionValue}]).
501
502       analyse(OptionList) ->
503                  ok | {error, Reason} | {'EXIT', ServerPid, Reason}
504
505              Types:
506
507                 OptionList = [Option]
508                 Option = analyse_option()
509                 ServerPid = pid()
510                 Reason = term()
511                 analyse_option() =
512                     append | callers |
513                     {callers, boolean()} |
514                     {cols, Cols :: integer() >= 0} |
515                     dest |
516                     {dest, Dest :: pid() | (Destfile :: file:filename())} |
517                     details |
518                     {details, boolean()} |
519                     no_callers | no_details |
520                     {sort, SortSpec :: acc | own} |
521                     totals |
522                     {totals, boolean()}
523
524              Analyses  raw  profile data in the fprof server. If called while
525              there is no raw profile data available, {error,  no_profile}  is
526              returned.
527
528              Destfile is used to call file:open/2.
529
530              Option description:
531
532                dest | {dest, Dest}:
533                  Specifies  the  destination for the analysis. If this option
534                  is not given or it is dest,  the  destination  will  be  the
535                  caller's group leader, otherwise the destination Dest is ei‐
536                  ther the pid() of an I/O device or a filename. And, finally,
537                  if the filename is [] - "fprof.analysis" is used instead.
538
539                append:
540                  Causes  the analysis to be appended to the destination file.
541                  This option is only allowed with the  {dest,  Destfile}  op‐
542                  tion.
543
544                {cols, Cols}:
545                  Specifies  the  number  of  columns in the analysis text. If
546                  this option is not given the number of columns is set to 80.
547
548                callers | {callers, true}:
549                  Prints callers and called information in the analysis.  This
550                  is the default.
551
552                {callers, false} | no_callers:
553                  Suppresses the printing of callers and called information in
554                  the analysis.
555
556                {sort, SortSpec}:
557                  Specifies if the analysis should be sorted according to  the
558                  ACC  column,  which  is  the default, or the OWN column. See
559                  Analysis Format below.
560
561                totals | {totals, true}:
562                  Includes a section containing call statistics for all  calls
563                  regardless of process, in the analysis.
564
565                {totals, false}:
566                  Supresses  the  totals section in the analysis, which is the
567                  default.
568
569                details | {details, true}:
570                  Prints call statistics for each  process  in  the  analysis.
571                  This is the default.
572
573                {details, false} | no_details:
574                  Suppresses  the  call  statistics  for each process from the
575                  analysis.
576

ANALYSIS FORMAT

578       This section describes the output format of the  analyse  command.  See
579       analyse/0.
580
581       The  format is parsable with the standard Erlang parsing tools erl_scan
582       and erl_parse, file:consult/1 or io:read/2. The parse format is not ex‐
583       plained here - it should be easy for the interested to try it out. Note
584       that some flags to analyse/1 will affect the format.
585
586       The following example was run on OTP/R8 on Solaris 8, all OTP internals
587       in this example are very version dependent.
588
589       As  an example, we will use the following function, that you may recog‐
590       nise as  a  slightly  modified  benchmark  function  from  the  manpage
591       file(3):
592
593       -module(foo).
594       -export([create_file_slow/2]).
595
596       create_file_slow(Name, N) when is_integer(N), N >= 0 ->
597           {ok, FD} =
598               file:open(Name, [raw, write, delayed_write, binary]),
599           if N > 256 ->
600                   ok = file:write(FD,
601                                   lists:map(fun (X) -> <<X:32/unsigned>> end,
602                                   lists:seq(0, 255))),
603                   ok = create_file_slow(FD, 256, N);
604              true ->
605                   ok = create_file_slow(FD, 0, N)
606           end,
607           ok = file:close(FD).
608
609       create_file_slow(FD, M, M) ->
610           ok;
611       create_file_slow(FD, M, N) ->
612           ok = file:write(FD, <<M:32/unsigned>>),
613           create_file_slow(FD, M+1, N).
614
615       Let us have a look at the printout after running:
616
617       1> fprof:apply(foo, create_file_slow, [junk, 1024]).
618       2> fprof:profile().
619       3> fprof:analyse().
620
621       The printout starts with:
622
623       %% Analysis results:
624       {  analysis_options,
625        [{callers, true},
626         {sort, acc},
627         {totals, false},
628         {details, true}]}.
629
630       %                                       CNT       ACC       OWN
631       [{ totals,                             9627, 1691.119, 1659.074}].  %%%
632
633       The  CNT column shows the total number of function calls that was found
634       in the trace. In the ACC column is the total time  of  the  trace  from
635       first timestamp to last. And in the OWN column is the sum of the execu‐
636       tion time in functions found in the trace, not including  called  func‐
637       tions. In this case it is very close to the ACC time since the emulator
638       had practically nothing else to do than to execute our test program.
639
640       All time values in the printout are in milliseconds.
641
642       The printout continues:
643
644       %                                       CNT       ACC       OWN
645       [{ "<0.28.0>",                         9627,undefined, 1659.074}].   %%
646
647       This is the printout header of one process. The printout contains  only
648       this  one process since we did fprof:apply/3 which traces only the cur‐
649       rent process. Therefore the CNT and OWN columns perfectly  matches  the
650       totals  above.  The ACC column is undefined since summing the ACC times
651       of all calls in the process makes no sense - you  would  get  something
652       like the ACC value from totals above multiplied by the average depth of
653       the call stack, or something.
654
655       All paragraphs up to the next process  header  only  concerns  function
656       calls within this process.
657
658       Now we come to something more interesting:
659
660       {[{undefined,                             0, 1691.076,    0.030}],
661        { {fprof,apply_start_stop,4},            0, 1691.076,    0.030},     %
662        [{{foo,create_file_slow,2},              1, 1691.046,    0.103},
663         {suspend,                               1,    0.000,    0.000}]}.
664
665       {[{{fprof,apply_start_stop,4},            1, 1691.046,    0.103}],
666        { {foo,create_file_slow,2},              1, 1691.046,    0.103},     %
667        [{{file,close,1},                        1, 1398.873,    0.019},
668         {{foo,create_file_slow,3},              1,  249.678,    0.029},
669         {{file,open,2},                         1,   20.778,    0.055},
670         {{lists,map,2},                         1,   16.590,    0.043},
671         {{lists,seq,2},                         1,    4.708,    0.017},
672         {{file,write,2},                        1,    0.316,    0.021}]}.
673
674       The  printout  consists of one paragraph per called function. The func‐
675       tion marked with '%' is the  one  the  paragraph  concerns  -  foo:cre‐
676       ate_file_slow/2.  Above the marked function are the calling functions -
677       those that has called the marked, and below are  those  called  by  the
678       marked function.
679
680       The  paragraphs  are  per default sorted in decreasing order of the ACC
681       column for the marked function. The calling list and called list within
682       one  paragraph are also per default sorted in decreasing order of their
683       ACC column.
684
685       The columns are: CNT - the  number  of  times  the  function  has  been
686       called,  ACC  -  the  time spent in the function including called func‐
687       tions, and OWN - the time spent in the function  not  including  called
688       functions.
689
690       The  rows  for  the calling functions contain statistics for the marked
691       function with the constraint that only the occasions when  a  call  was
692       made from the row's function to the marked function are accounted for.
693
694       The  row for the marked function simply contains the sum of all calling
695       rows.
696
697       The rows for the called functions contains  statistics  for  the  row's
698       function  with  the  constraint that only the occasions when a call was
699       made from the marked to the row's function are accounted for.
700
701       So, we see that foo:create_file_slow/2 used very little  time  for  its
702       own  execution. It spent most of its time in file:close/1. The function
703       foo:create_file_slow/3 that writes 3/4 of the file contents is the sec‐
704       ond biggest time thief.
705
706       We  also  see that the call to file:write/2 that writes 1/4 of the file
707       contents takes very little time in itself. What takes time is to  build
708       the data (lists:seq/2 and lists:map/2).
709
710       The function 'undefined' that has called fprof:apply_start_stop/4 is an
711       unknown function because that call was not recorded in  the  trace.  It
712       was   only   recorded   that  the  execution  returned  from  fprof:ap‐
713       ply_start_stop/4 to some other function above in  the  call  stack,  or
714       that the process exited from there.
715
716       Let us continue down the printout to find:
717
718       {[{{foo,create_file_slow,2},              1,  249.678,    0.029},
719         {{foo,create_file_slow,3},            768,    0.000,   23.294}],
720        { {foo,create_file_slow,3},            769,  249.678,   23.323},     %
721        [{{file,write,2},                      768,  220.314,   14.539},
722         {suspend,                              57,    6.041,    0.000},
723         {{foo,create_file_slow,3},            768,    0.000,   23.294}]}.
724
725       If  you  compare  with  the  code you will see there also that foo:cre‐
726       ate_file_slow/3 was called only from foo:create_file_slow/2 and itself,
727       and called only file:write/2, note the number of calls to file:write/2.
728       But here we see that suspend was called a few times. This is  a  pseudo
729       function  that indicates that the process was suspended while executing
730       in foo:create_file_slow/3,  and  since  there  is  no  receive  or  er‐
731       lang:yield/0  in the code, it must be Erlang scheduling suspensions, or
732       the trace file driver compensating  for  large  file  write  operations
733       (these  are regarded as a schedule out followed by a schedule in to the
734       same process).
735
736       Let us find the suspend entry:
737
738       {[{{file,write,2},                       53,    6.281,    0.000},
739         {{foo,create_file_slow,3},             57,    6.041,    0.000},
740         {{prim_file,drv_command,4},            50,    4.582,    0.000},
741         {{prim_file,drv_get_response,1},       34,    2.986,    0.000},
742         {{lists,map,2},                        10,    2.104,    0.000},
743         {{prim_file,write,2},                  17,    1.852,    0.000},
744         {{erlang,port_command,2},              15,    1.713,    0.000},
745         {{prim_file,drv_command,2},            22,    1.482,    0.000},
746         {{prim_file,translate_response,2},     11,    1.441,    0.000},
747         {{prim_file,'-drv_command/2-fun-0-',1},  15,    1.340,    0.000},
748         {{lists,seq,4},                         3,    0.880,    0.000},
749         {{foo,'-create_file_slow/2-fun-0-',1},   5,    0.523,    0.000},
750         {{erlang,bump_reductions,1},            4,    0.503,    0.000},
751         {{prim_file,open_int_setopts,3},        1,    0.165,    0.000},
752         {{prim_file,i32,4},                     1,    0.109,    0.000},
753         {{fprof,apply_start_stop,4},            1,    0.000,    0.000}],
754        { suspend,                             299,   32.002,    0.000},     %
755        [ ]}.
756
757       We find no particulary long suspend times, so no function seems to have
758       waited  in  a receive statement. Actually, prim_file:drv_command/4 con‐
759       tains a receive statement, but in this test program, the  message  lies
760       in the process receive buffer when the receive statement is entered. We
761       also see that the total suspend time for the test run is small.
762
763       The suspend pseudo function has got an OWN time of  zero.  This  is  to
764       prevent  the  process total OWN time from including time in suspension.
765       Whether suspend time is really ACC or OWN time is more of a philosophi‐
766       cal question.
767
768       Now we look at another interesting pseudo function, garbage_collect:
769
770       {[{{prim_file,drv_command,4},            25,    0.873,    0.873},
771         {{prim_file,write,2},                  16,    0.692,    0.692},
772         {{lists,map,2},                         2,    0.195,    0.195}],
773        { garbage_collect,                      43,    1.760,    1.760},     %
774        [ ]}.
775
776       Here  we  see that no function distinguishes itself considerably, which
777       is very normal.
778
779       The garbage_collect pseudo function has not got an  OWN  time  of  zero
780       like suspend, instead it is equal to the ACC time.
781
782       Garbage  collect  often  occurs while a process is suspended, but fprof
783       hides this fact by pretending that the suspended function was first un‐
784       suspended and then garbage collected. Otherwise the printout would show
785       garbage_collect being called from suspend but not which  function  that
786       might have caused the garbage collection.
787
788       Let us now get back to the test code:
789
790       {[{{foo,create_file_slow,3},            768,  220.314,   14.539},
791         {{foo,create_file_slow,2},              1,    0.316,    0.021}],
792        { {file,write,2},                      769,  220.630,   14.560},     %
793        [{{prim_file,write,2},                 769,  199.789,   22.573},
794         {suspend,                              53,    6.281,    0.000}]}.
795
796       Not  unexpectedly,  we  see  that file:write/2 was called from foo:cre‐
797       ate_file_slow/3 and foo:create_file_slow/2. The number of calls in each
798       case  as  well as the used time are also just confirms the previous re‐
799       sults.
800
801       We see that file:write/2 only calls prim_file:write/2, but let  us  re‐
802       frain from digging into the internals of the kernel application.
803
804       But,  if  we nevertheless do dig down we find the call to the linked in
805       driver that does the file operations towards the host operating system:
806
807       {[{{prim_file,drv_command,4},           772, 1458.356, 1456.643}],
808        { {erlang,port_command,2},             772, 1458.356, 1456.643},     %
809        [{suspend,                              15,    1.713,    0.000}]}.
810
811       This is 86 % of the total run time, and as we  saw  before  it  is  the
812       close  operation the absolutely biggest contributor. We find a compari‐
813       son ratio a little bit up in the call stack:
814
815       {[{{prim_file,close,1},                   1, 1398.748,    0.024},
816         {{prim_file,write,2},                 769,  174.672,   12.810},
817         {{prim_file,open_int,4},                1,   19.755,    0.017},
818         {{prim_file,open_int_setopts,3},        1,    0.147,    0.016}],
819        { {prim_file,drv_command,2},           772, 1593.322,   12.867},     %
820        [{{prim_file,drv_command,4},           772, 1578.973,   27.265},
821         {suspend,                              22,    1.482,    0.000}]}.
822
823       The time for file operations in the linked in driver distributes itself
824       as  1 % for open, 11 % for write and 87 % for close. All data is proba‐
825       bly buffered in the operating system until the close.
826
827       The  unsleeping  reader   may   notice   that   the   ACC   times   for
828       prim_file:drv_command/2  and  prim_file:drv_command/4  is not equal be‐
829       tween the paragraphs above, even though it  is  easy  to  believe  that
830       prim_file:drv_command/2 is just a passthrough function.
831
832       The  missing  time can be found in the paragraph for prim_file:drv_com‐
833       mand/4 where it is evident that  not  only  prim_file:drv_command/2  is
834       called but also a fun:
835
836       {[{{prim_file,drv_command,2},           772, 1578.973,   27.265}],
837        { {prim_file,drv_command,4},           772, 1578.973,   27.265},     %
838        [{{erlang,port_command,2},             772, 1458.356, 1456.643},
839         {{prim_file,'-drv_command/2-fun-0-',1}, 772,   87.897,   12.736},
840         {suspend,                              50,    4.582,    0.000},
841         {garbage_collect,                      25,    0.873,    0.873}]}.
842
843       And  some  more  missing  time  can  be  explained  by  the  fact  that
844       prim_file:open_int/4 both  calls  prim_file:drv_command/2  directly  as
845       well  as  through  prim_file:open_int_setopts/3,  which complicates the
846       picture.
847
848       {[{{prim_file,open,2},                    1,   20.309,    0.029},
849         {{prim_file,open_int,4},                1,    0.000,    0.057}],
850        { {prim_file,open_int,4},                2,   20.309,    0.086},     %
851        [{{prim_file,drv_command,2},             1,   19.755,    0.017},
852         {{prim_file,open_int_setopts,3},        1,    0.360,    0.032},
853         {{prim_file,drv_open,2},                1,    0.071,    0.030},
854         {{erlang,list_to_binary,1},             1,    0.020,    0.020},
855         {{prim_file,i32,1},                     1,    0.017,    0.017},
856         {{prim_file,open_int,4},                1,    0.000,    0.057}]}.
857       {[{{prim_file,open_int,4},                1,    0.360,    0.032},
858         {{prim_file,open_int_setopts,3},        1,    0.000,    0.016}],
859        { {prim_file,open_int_setopts,3},        2,    0.360,    0.048},     %
860        [{suspend,                               1,    0.165,    0.000},
861         {{prim_file,drv_command,2},             1,    0.147,    0.016},
862         {{prim_file,open_int_setopts,3},        1,    0.000,    0.016}]}.
863

NOTES

865       The actual supervision of execution times is in itself a CPU  intensive
866       activity.  A  message  is  written on the trace file for every function
867       call that is made by the profiled code.
868
869       The ACC time calculation is sometimes difficult to make correct,  since
870       it  is difficult to define. This happens especially when a function oc‐
871       curs in several instances in the call stack, for example by calling it‐
872       self  perhaps  through other functions and perhaps even non-tail recur‐
873       sively.
874
875       To produce sensible results, fprof tries not  to  charge  any  function
876       more  than once for ACC time. The instance highest up (with longest du‐
877       ration) in the call stack is chosen.
878
879       Sometimes a function may unexpectedly waste a lot (some 10 ms  or  more
880       depending  on  host  machine  OS) of OWN (and ACC) time, even functions
881       that do practically nothing at all. The problem may be that the OS  has
882       chosen  to  schedule out the Erlang runtime system process for a while,
883       and if the OS does not support high resolution  cpu  time  measurements
884       fprof  will use wallclock time for its calculations, and it will appear
885       as functions randomly burn virtual machine time.
886

SEE ALSO

888       dbg(3), eprof(3), erlang(3), io(3), Tools User's Guide
889
890
891
892Ericsson AB                       tools 3.5.2                         fprof(3)
Impressum