1fprof(3) Erlang Module Definition fprof(3)
2
3
4
6 fprof - A Time Profiling Tool using trace to file for minimal runtime
7 performance impact.
8
10 This module is used to profile a program to find out how the execution
11 time is used. Trace to file is used to minimize runtime performance im‐
12 pact.
13
14 The fprof module uses tracing to collect profiling data, hence there is
15 no need for special compilation of any module to be profiled. When it
16 starts tracing, fprof will erase all previous tracing in the node and
17 set the necessary trace flags on the profiling target processes as well
18 as local call trace on all functions in all loaded modules and all mod‐
19 ules to be loaded. fprof erases all tracing in the node when it stops
20 tracing.
21
22 fprof presents both own time i.e how much time a function has used for
23 its own execution, and accumulated time i.e including called functions.
24 All presented times are collected using trace timestamps. fprof tries
25 to collect cpu time timestamps, if the host machine OS supports it.
26 Therefore the times may be wallclock times and OS scheduling will ran‐
27 domly strike all called functions in a presumably fair way.
28
29 If, however, the profiling time is short, and the host machine OS does
30 not support high resolution cpu time measurements, some few OS schedul‐
31 ings may show up as ridiculously long execution times for functions do‐
32 ing practically nothing. An example of a function more or less just
33 composing a tuple in about 100 times the normal execution time has been
34 seen, and when the tracing was repeated, the execution time became nor‐
35 mal.
36
37 Profiling is essentially done in 3 steps:
38
39 1:
40 Tracing; to file, as mentioned in the previous paragraph. The trace
41 contains entries for function calls, returns to function, process
42 scheduling, other process related (spawn, etc) events, and garbage
43 collection. All trace entries are timestamped.
44
45 2:
46 Profiling; the trace file is read, the execution call stack is sim‐
47 ulated, and raw profile data is calculated from the simulated call
48 stack and the trace timestamps. The profile data is stored in the
49 fprof server state. During this step the trace data may be dumped
50 in text format to file or console.
51
52 3:
53 Analysing; the raw profile data is sorted, filtered and dumped in
54 text format either to file or console. The text format intended to
55 be both readable for a human reader, as well as parsable with the
56 standard erlang parsing tools.
57
58 Since fprof uses trace to file, the runtime performance degradation is
59 minimized, but still far from negligible, especially for programs that
60 use the filesystem heavily by themselves. Where you place the trace
61 file is also important, e.g on Solaris /tmp is usually a good choice
62 since it is essentially a RAM disk, while any NFS (network) mounted
63 disk is a bad idea.
64
65 fprof can also skip the file step and trace to a tracer process that
66 does the profiling in runtime.
67
69 start() -> {ok, Pid} | {error, {already_started, Pid}}
70
71 Types:
72
73 Pid = pid()
74
75 Starts the fprof server.
76
77 Note that it seldom needs to be started explicitly since it is
78 automatically started by the functions that need a running
79 server.
80
81 stop() -> ok
82
83 Same as stop(normal).
84
85 stop(Reason) -> ok
86
87 Types:
88
89 Reason = term()
90
91 Stops the fprof server.
92
93 The supplied Reason becomes the exit reason for the server
94 process. Default Any Reason other than kill sends a request to
95 the server and waits for it to clean up, reply and exit. If Rea‐
96 son is kill, the server is bluntly killed.
97
98 If the fprof server is not running, this function returns imme‐
99 diately with the same return value.
100
101 Note:
102 When the fprof server is stopped the collected raw profile data
103 is lost.
104
105
106 apply(Func, Args) -> term()
107
108 Types:
109
110 Func = function() | {Module :: module(), Function :: atom()}
111 Args = [term()]
112
113 Same as apply(Func, Args, []).
114
115 apply(Module, Function, Args) -> term()
116
117 Types:
118
119 Module = module()
120 Function = atom()
121 Args = [term()]
122
123 Same as apply({Module, Function}, Args, []).
124
125 apply(Func, Args, OptionList) -> term()
126
127 Types:
128
129 Func = function() | {Module :: module(), Function :: atom()}
130 Args = [term()]
131 OptionList = [Option]
132 Option = apply_option()
133 apply_option() =
134 continue |
135 {procs, PidList :: [pid()]} |
136 start |
137 (TraceStartOption :: trace_option())
138 trace_option() =
139 cpu_time |
140 {cpu_time, boolean()} |
141 file |
142 {file, Filename :: file:filename()} |
143 {procs, PidSpec :: pid_spec()} |
144 {procs, [PidSpec :: pid_spec()]} |
145 start | stop |
146 {tracer, Tracer :: pid() | port()} |
147 verbose |
148 {verbose, boolean()}
149 pid_spec() = pid() | atom()
150
151 Calls erlang:apply(Func, Args) surrounded by trace([start, ...])
152 and trace(stop).
153
154 Some effort is made to keep the trace clean from unnecessary
155 trace messages; tracing is started and stopped from a spawned
156 process while the erlang:apply/2 call is made in the current
157 process, only surrounded by receive and send statements towards
158 the trace starting process. The trace starting process exits
159 when not needed any more.
160
161 The TraceStartOption is any option allowed for trace/1. The op‐
162 tions [start, {procs, [self() | PidList]} | OptList] are given
163 to trace/1, where OptList is OptionList with continue, start and
164 {procs, _} options removed.
165
166 The continue option inhibits the call to trace(stop) and leaves
167 it up to the caller to stop tracing at a suitable time.
168
169 apply(Module, Function, Args, OptionList) -> term()
170
171 Types:
172
173 Module = module()
174 Function = atom()
175 Args = [term()]
176 OptionList = [Option]
177 Option = apply_option()
178 apply_option() =
179 continue |
180 {procs, PidList :: [pid()]} |
181 start |
182 (TraceStartOption :: trace_option())
183 trace_option() =
184 cpu_time |
185 {cpu_time, boolean()} |
186 file |
187 {file, Filename :: file:filename()} |
188 {procs, PidSpec :: pid_spec()} |
189 {procs, [PidSpec :: pid_spec()]} |
190 start | stop |
191 {tracer, Tracer :: pid() | port()} |
192 verbose |
193 {verbose, boolean()}
194 pid_spec() = pid() | atom()
195
196 Same as apply({Module, Function}, Args, OptionList).
197
198 OptionList is an option list allowed for apply/3.
199
200 trace(OptionName :: start, Filename) ->
201 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
202
203 Types:
204
205 Filename = file:filename()
206 ServerPid = pid()
207 Reason = term()
208
209 Same as trace([start, {file, Filename}]).
210
211 trace(OptionName :: verbose, Filename) ->
212 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
213
214 Types:
215
216 Filename = file:filename()
217 ServerPid = pid()
218 Reason = term()
219
220 Same as trace([start, verbose, {file, Filename}]).
221
222 trace(OptionName, OptionValue) ->
223 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
224
225 Types:
226
227 OptionName = atom()
228 OptionValue = term()
229 ServerPid = pid()
230 Reason = term()
231
232 Same as trace([{OptionName, OptionValue}]).
233
234 trace(Option :: verbose) ->
235 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
236
237 Types:
238
239 ServerPid = pid()
240 Reason = term()
241
242 Same as trace([start, verbose]).
243
244 trace(OptionName) ->
245 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
246
247 Types:
248
249 OptionName = atom()
250 ServerPid = pid()
251 Reason = term()
252
253 Same as trace([OptionName]).
254
255 trace(Option :: {OptionName, OptionValue}) ->
256 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
257
258 Types:
259
260 OptionName = atom()
261 OptionValue = term()
262 ServerPid = pid()
263 Reason = term()
264
265 Same as trace([{OptionName, OptionValue}]).
266
267 trace(OptionList) ->
268 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
269
270 Types:
271
272 OptionList = [Option]
273 Option = trace_option()
274 ServerPid = pid()
275 Reason = term()
276 trace_option() =
277 cpu_time |
278 {cpu_time, boolean()} |
279 file |
280 {file, Filename :: file:filename()} |
281 {procs, PidSpec :: pid_spec()} |
282 {procs, [PidSpec :: pid_spec()]} |
283 start | stop |
284 {tracer, Tracer :: pid() | port()} |
285 verbose |
286 {verbose, boolean()}
287 pid_spec() = pid() | atom()
288
289 Starts or stops tracing.
290
291 PidSpec and Tracer are used in calls to erlang:trace(PidSpec,
292 true, [{tracer, Tracer} | Flags]), and Filename is used to call
293 dbg:trace_port(file, Filename). Please see erlang:trace/3 and
294 dbg:trace_port/2.
295
296 Option description:
297
298 stop:
299 Stops a running fprof trace and clears all tracing from the
300 node. Either option stop or start must be specified, but not
301 both.
302
303 start:
304 Clears all tracing from the node and starts a new fprof
305 trace. Either option start or stop must be specified, but
306 not both.
307
308 verbose | {verbose, boolean()}:
309 The options verbose or {verbose, true} adds some trace flags
310 that fprof does not need, but that may be interesting for
311 general debugging purposes. This option is only allowed with
312 the start option.
313
314 cpu_time | {cpu_time, boolean()}:
315 The options cpu_time or {cpu_time, true} makes the time‐
316 stamps in the trace be in CPU time instead of wallclock time
317 which is the default. This option is only allowed with the
318 start option.
319
320 Warning:
321 Getting correct values out of cpu_time can be difficult. The
322 best way to get correct values is to run using a single sched‐
323 uler and bind that scheduler to a specific CPU, i.e. erl +S 1
324 +sbt db.
325
326
327 {procs, PidSpec} | {procs, [PidSpec]}:
328 Specifies which processes that shall be traced. If this op‐
329 tion is not given, the calling process is traced. All pro‐
330 cesses spawned by the traced processes are also traced. This
331 option is only allowed with the start option.
332
333 file | {file, Filename}:
334 Specifies the filename of the trace. If the option file is
335 given, or none of these options are given, the file
336 "fprof.trace" is used. This option is only allowed with the
337 start option, but not with the {tracer, Tracer} option.
338
339 {tracer, Tracer}:
340 Specifies that trace to process or port shall be done in‐
341 stead of trace to file. This option is only allowed with the
342 start option, but not with the {file, Filename} option.
343
344 profile() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
345
346 Types:
347
348 ServerPid = pid()
349 Reason = term()
350
351 Same as profile([]).
352
353 profile(OptionName, OptionValue) ->
354 ok |
355 {ok, Tracer} |
356 {error, Reason} |
357 {'EXIT', ServerPid, Reason}
358
359 Types:
360
361 OptionName = atom()
362 OptionValue = term()
363 Tracer = ServerPid = pid()
364 Reason = term()
365
366 Same as profile([{OptionName, OptionValue}]).
367
368 profile(OptionName) ->
369 ok |
370 {ok, Tracer} |
371 {error, Reason} |
372 {'EXIT', ServerPid, Reason}
373
374 Types:
375
376 OptionName = atom()
377 Tracer = ServerPid = pid()
378 Reason = term()
379
380 Same as profile([OptionName]).
381
382 profile(Option :: {OptionName, OptionValue}) ->
383 ok |
384 {ok, Tracer} |
385 {error, Reason} |
386 {'EXIT', ServerPid, Reason}
387
388 Types:
389
390 OptionName = atom()
391 OptionValue = term()
392 Tracer = ServerPid = pid()
393 Reason = term()
394
395 Same as profile([{OptionName, OptionValue}]).
396
397 profile(OptionList) ->
398 ok |
399 {ok, Tracer} |
400 {error, Reason} |
401 {'EXIT', ServerPid, Reason}
402
403 Types:
404
405 OptionList = [Option]
406 Option = profile_option()
407 Tracer = ServerPid = pid()
408 Reason = term()
409 profile_option() =
410 append | dump |
411 {dump, pid() | (Dump :: (Dumpfile :: file:filename() | []))} |
412 file |
413 {file, Filename :: file:filename()} |
414 start | stop
415
416 Compiles a trace into raw profile data held by the fprof server.
417
418 Dumpfile is used to call file:open/2, and Filename is used to
419 call dbg:trace_port(file, Filename). Please see file:open/2 and
420 dbg:trace_port/2.
421
422 Option description:
423
424 file | {file, Filename}:
425 Reads the file Filename and creates raw profile data that is
426 stored in RAM by the fprof server. If the option file is
427 given, or none of these options are given, the file
428 "fprof.trace" is read. The call will return when the whole
429 trace has been read with the return value ok if successful.
430 This option is not allowed with the start or stop options.
431
432 dump | {dump, Dump}:
433 Specifies the destination for the trace text dump. If this
434 option is not given, no dump is generated, if it is dump the
435 destination will be the caller's group leader, otherwise the
436 destination Dump is either the pid of an I/O device or a
437 filename. And, finally, if the filename is [] - "fprof.dump"
438 is used instead. This option is not allowed with the stop
439 option.
440
441 append:
442 Causes the trace text dump to be appended to the destination
443 file. This option is only allowed with the {dump, Dumpfile}
444 option.
445
446 start:
447 Starts a tracer process that profiles trace data in runtime.
448 The call will return immediately with the return value {ok,
449 Tracer} if successful. This option is not allowed with the
450 stop, file or {file, Filename} options.
451
452 stop:
453 Stops the tracer process that profiles trace data in run‐
454 time. The return value will be value ok if successful. This
455 option is not allowed with the start, file or {file, File‐
456 name} options.
457
458 analyse() -> ok | {error, Reason} | {'EXIT', ServerPid, Reason}
459
460 Types:
461
462 ServerPid = pid()
463 Reason = term()
464
465 Same as analyse([]).
466
467 analyse(OptionName, OptionValue) ->
468 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
469
470 Types:
471
472 OptionName = atom()
473 OptionValue = term()
474 ServerPid = pid()
475 Reason = term()
476
477 Same as analyse([{OptionName, OptionValue}]).
478
479 analyse(OptionName) ->
480 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
481
482 Types:
483
484 OptionName = atom()
485 ServerPid = pid()
486 Reason = term()
487
488 Same as analyse([OptionName]).
489
490 analyse(Option :: {OptionName, OptionValue}) ->
491 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
492
493 Types:
494
495 OptionName = atom()
496 OptionValue = term()
497 ServerPid = pid()
498 Reason = term()
499
500 Same as analyse([{OptionName, OptionValue}]).
501
502 analyse(OptionList) ->
503 ok | {error, Reason} | {'EXIT', ServerPid, Reason}
504
505 Types:
506
507 OptionList = [Option]
508 Option = analyse_option()
509 ServerPid = pid()
510 Reason = term()
511 analyse_option() =
512 append | callers |
513 {callers, boolean()} |
514 {cols, Cols :: integer() >= 0} |
515 dest |
516 {dest, Dest :: pid() | (Destfile :: file:filename())} |
517 details |
518 {details, boolean()} |
519 no_callers | no_details |
520 {sort, SortSpec :: acc | own} |
521 totals |
522 {totals, boolean()}
523
524 Analyses raw profile data in the fprof server. If called while
525 there is no raw profile data available, {error, no_profile} is
526 returned.
527
528 Destfile is used to call file:open/2.
529
530 Option description:
531
532 dest | {dest, Dest}:
533 Specifies the destination for the analysis. If this option
534 is not given or it is dest, the destination will be the
535 caller's group leader, otherwise the destination Dest is ei‐
536 ther the pid() of an I/O device or a filename. And, finally,
537 if the filename is [] - "fprof.analysis" is used instead.
538
539 append:
540 Causes the analysis to be appended to the destination file.
541 This option is only allowed with the {dest, Destfile} op‐
542 tion.
543
544 {cols, Cols}:
545 Specifies the number of columns in the analysis text. If
546 this option is not given the number of columns is set to 80.
547
548 callers | {callers, true}:
549 Prints callers and called information in the analysis. This
550 is the default.
551
552 {callers, false} | no_callers:
553 Suppresses the printing of callers and called information in
554 the analysis.
555
556 {sort, SortSpec}:
557 Specifies if the analysis should be sorted according to the
558 ACC column, which is the default, or the OWN column. See
559 Analysis Format below.
560
561 totals | {totals, true}:
562 Includes a section containing call statistics for all calls
563 regardless of process, in the analysis.
564
565 {totals, false}:
566 Supresses the totals section in the analysis, which is the
567 default.
568
569 details | {details, true}:
570 Prints call statistics for each process in the analysis.
571 This is the default.
572
573 {details, false} | no_details:
574 Suppresses the call statistics for each process from the
575 analysis.
576
578 This section describes the output format of the analyse command. See
579 analyse/0.
580
581 The format is parsable with the standard Erlang parsing tools erl_scan
582 and erl_parse, file:consult/1 or io:read/2. The parse format is not ex‐
583 plained here - it should be easy for the interested to try it out. Note
584 that some flags to analyse/1 will affect the format.
585
586 The following example was run on OTP/R8 on Solaris 8, all OTP internals
587 in this example are very version dependent.
588
589 As an example, we will use the following function, that you may recog‐
590 nise as a slightly modified benchmark function from the manpage
591 file(3):
592
593 -module(foo).
594 -export([create_file_slow/2]).
595
596 create_file_slow(Name, N) when is_integer(N), N >= 0 ->
597 {ok, FD} =
598 file:open(Name, [raw, write, delayed_write, binary]),
599 if N > 256 ->
600 ok = file:write(FD,
601 lists:map(fun (X) -> <<X:32/unsigned>> end,
602 lists:seq(0, 255))),
603 ok = create_file_slow(FD, 256, N);
604 true ->
605 ok = create_file_slow(FD, 0, N)
606 end,
607 ok = file:close(FD).
608
609 create_file_slow(FD, M, M) ->
610 ok;
611 create_file_slow(FD, M, N) ->
612 ok = file:write(FD, <<M:32/unsigned>>),
613 create_file_slow(FD, M+1, N).
614
615 Let us have a look at the printout after running:
616
617 1> fprof:apply(foo, create_file_slow, [junk, 1024]).
618 2> fprof:profile().
619 3> fprof:analyse().
620
621 The printout starts with:
622
623 %% Analysis results:
624 { analysis_options,
625 [{callers, true},
626 {sort, acc},
627 {totals, false},
628 {details, true}]}.
629
630 % CNT ACC OWN
631 [{ totals, 9627, 1691.119, 1659.074}]. %%%
632
633 The CNT column shows the total number of function calls that was found
634 in the trace. In the ACC column is the total time of the trace from
635 first timestamp to last. And in the OWN column is the sum of the execu‐
636 tion time in functions found in the trace, not including called func‐
637 tions. In this case it is very close to the ACC time since the emulator
638 had practically nothing else to do than to execute our test program.
639
640 All time values in the printout are in milliseconds.
641
642 The printout continues:
643
644 % CNT ACC OWN
645 [{ "<0.28.0>", 9627,undefined, 1659.074}]. %%
646
647 This is the printout header of one process. The printout contains only
648 this one process since we did fprof:apply/3 which traces only the cur‐
649 rent process. Therefore the CNT and OWN columns perfectly matches the
650 totals above. The ACC column is undefined since summing the ACC times
651 of all calls in the process makes no sense - you would get something
652 like the ACC value from totals above multiplied by the average depth of
653 the call stack, or something.
654
655 All paragraphs up to the next process header only concerns function
656 calls within this process.
657
658 Now we come to something more interesting:
659
660 {[{undefined, 0, 1691.076, 0.030}],
661 { {fprof,apply_start_stop,4}, 0, 1691.076, 0.030}, %
662 [{{foo,create_file_slow,2}, 1, 1691.046, 0.103},
663 {suspend, 1, 0.000, 0.000}]}.
664
665 {[{{fprof,apply_start_stop,4}, 1, 1691.046, 0.103}],
666 { {foo,create_file_slow,2}, 1, 1691.046, 0.103}, %
667 [{{file,close,1}, 1, 1398.873, 0.019},
668 {{foo,create_file_slow,3}, 1, 249.678, 0.029},
669 {{file,open,2}, 1, 20.778, 0.055},
670 {{lists,map,2}, 1, 16.590, 0.043},
671 {{lists,seq,2}, 1, 4.708, 0.017},
672 {{file,write,2}, 1, 0.316, 0.021}]}.
673
674 The printout consists of one paragraph per called function. The func‐
675 tion marked with '%' is the one the paragraph concerns - foo:cre‐
676 ate_file_slow/2. Above the marked function are the calling functions -
677 those that has called the marked, and below are those called by the
678 marked function.
679
680 The paragraphs are per default sorted in decreasing order of the ACC
681 column for the marked function. The calling list and called list within
682 one paragraph are also per default sorted in decreasing order of their
683 ACC column.
684
685 The columns are: CNT - the number of times the function has been
686 called, ACC - the time spent in the function including called func‐
687 tions, and OWN - the time spent in the function not including called
688 functions.
689
690 The rows for the calling functions contain statistics for the marked
691 function with the constraint that only the occasions when a call was
692 made from the row's function to the marked function are accounted for.
693
694 The row for the marked function simply contains the sum of all calling
695 rows.
696
697 The rows for the called functions contains statistics for the row's
698 function with the constraint that only the occasions when a call was
699 made from the marked to the row's function are accounted for.
700
701 So, we see that foo:create_file_slow/2 used very little time for its
702 own execution. It spent most of its time in file:close/1. The function
703 foo:create_file_slow/3 that writes 3/4 of the file contents is the sec‐
704 ond biggest time thief.
705
706 We also see that the call to file:write/2 that writes 1/4 of the file
707 contents takes very little time in itself. What takes time is to build
708 the data (lists:seq/2 and lists:map/2).
709
710 The function 'undefined' that has called fprof:apply_start_stop/4 is an
711 unknown function because that call was not recorded in the trace. It
712 was only recorded that the execution returned from fprof:ap‐
713 ply_start_stop/4 to some other function above in the call stack, or
714 that the process exited from there.
715
716 Let us continue down the printout to find:
717
718 {[{{foo,create_file_slow,2}, 1, 249.678, 0.029},
719 {{foo,create_file_slow,3}, 768, 0.000, 23.294}],
720 { {foo,create_file_slow,3}, 769, 249.678, 23.323}, %
721 [{{file,write,2}, 768, 220.314, 14.539},
722 {suspend, 57, 6.041, 0.000},
723 {{foo,create_file_slow,3}, 768, 0.000, 23.294}]}.
724
725 If you compare with the code you will see there also that foo:cre‐
726 ate_file_slow/3 was called only from foo:create_file_slow/2 and itself,
727 and called only file:write/2, note the number of calls to file:write/2.
728 But here we see that suspend was called a few times. This is a pseudo
729 function that indicates that the process was suspended while executing
730 in foo:create_file_slow/3, and since there is no receive or er‐
731 lang:yield/0 in the code, it must be Erlang scheduling suspensions, or
732 the trace file driver compensating for large file write operations
733 (these are regarded as a schedule out followed by a schedule in to the
734 same process).
735
736 Let us find the suspend entry:
737
738 {[{{file,write,2}, 53, 6.281, 0.000},
739 {{foo,create_file_slow,3}, 57, 6.041, 0.000},
740 {{prim_file,drv_command,4}, 50, 4.582, 0.000},
741 {{prim_file,drv_get_response,1}, 34, 2.986, 0.000},
742 {{lists,map,2}, 10, 2.104, 0.000},
743 {{prim_file,write,2}, 17, 1.852, 0.000},
744 {{erlang,port_command,2}, 15, 1.713, 0.000},
745 {{prim_file,drv_command,2}, 22, 1.482, 0.000},
746 {{prim_file,translate_response,2}, 11, 1.441, 0.000},
747 {{prim_file,'-drv_command/2-fun-0-',1}, 15, 1.340, 0.000},
748 {{lists,seq,4}, 3, 0.880, 0.000},
749 {{foo,'-create_file_slow/2-fun-0-',1}, 5, 0.523, 0.000},
750 {{erlang,bump_reductions,1}, 4, 0.503, 0.000},
751 {{prim_file,open_int_setopts,3}, 1, 0.165, 0.000},
752 {{prim_file,i32,4}, 1, 0.109, 0.000},
753 {{fprof,apply_start_stop,4}, 1, 0.000, 0.000}],
754 { suspend, 299, 32.002, 0.000}, %
755 [ ]}.
756
757 We find no particulary long suspend times, so no function seems to have
758 waited in a receive statement. Actually, prim_file:drv_command/4 con‐
759 tains a receive statement, but in this test program, the message lies
760 in the process receive buffer when the receive statement is entered. We
761 also see that the total suspend time for the test run is small.
762
763 The suspend pseudo function has got an OWN time of zero. This is to
764 prevent the process total OWN time from including time in suspension.
765 Whether suspend time is really ACC or OWN time is more of a philosophi‐
766 cal question.
767
768 Now we look at another interesting pseudo function, garbage_collect:
769
770 {[{{prim_file,drv_command,4}, 25, 0.873, 0.873},
771 {{prim_file,write,2}, 16, 0.692, 0.692},
772 {{lists,map,2}, 2, 0.195, 0.195}],
773 { garbage_collect, 43, 1.760, 1.760}, %
774 [ ]}.
775
776 Here we see that no function distinguishes itself considerably, which
777 is very normal.
778
779 The garbage_collect pseudo function has not got an OWN time of zero
780 like suspend, instead it is equal to the ACC time.
781
782 Garbage collect often occurs while a process is suspended, but fprof
783 hides this fact by pretending that the suspended function was first un‐
784 suspended and then garbage collected. Otherwise the printout would show
785 garbage_collect being called from suspend but not which function that
786 might have caused the garbage collection.
787
788 Let us now get back to the test code:
789
790 {[{{foo,create_file_slow,3}, 768, 220.314, 14.539},
791 {{foo,create_file_slow,2}, 1, 0.316, 0.021}],
792 { {file,write,2}, 769, 220.630, 14.560}, %
793 [{{prim_file,write,2}, 769, 199.789, 22.573},
794 {suspend, 53, 6.281, 0.000}]}.
795
796 Not unexpectedly, we see that file:write/2 was called from foo:cre‐
797 ate_file_slow/3 and foo:create_file_slow/2. The number of calls in each
798 case as well as the used time are also just confirms the previous re‐
799 sults.
800
801 We see that file:write/2 only calls prim_file:write/2, but let us re‐
802 frain from digging into the internals of the kernel application.
803
804 But, if we nevertheless do dig down we find the call to the linked in
805 driver that does the file operations towards the host operating system:
806
807 {[{{prim_file,drv_command,4}, 772, 1458.356, 1456.643}],
808 { {erlang,port_command,2}, 772, 1458.356, 1456.643}, %
809 [{suspend, 15, 1.713, 0.000}]}.
810
811 This is 86 % of the total run time, and as we saw before it is the
812 close operation the absolutely biggest contributor. We find a compari‐
813 son ratio a little bit up in the call stack:
814
815 {[{{prim_file,close,1}, 1, 1398.748, 0.024},
816 {{prim_file,write,2}, 769, 174.672, 12.810},
817 {{prim_file,open_int,4}, 1, 19.755, 0.017},
818 {{prim_file,open_int_setopts,3}, 1, 0.147, 0.016}],
819 { {prim_file,drv_command,2}, 772, 1593.322, 12.867}, %
820 [{{prim_file,drv_command,4}, 772, 1578.973, 27.265},
821 {suspend, 22, 1.482, 0.000}]}.
822
823 The time for file operations in the linked in driver distributes itself
824 as 1 % for open, 11 % for write and 87 % for close. All data is proba‐
825 bly buffered in the operating system until the close.
826
827 The unsleeping reader may notice that the ACC times for
828 prim_file:drv_command/2 and prim_file:drv_command/4 is not equal be‐
829 tween the paragraphs above, even though it is easy to believe that
830 prim_file:drv_command/2 is just a passthrough function.
831
832 The missing time can be found in the paragraph for prim_file:drv_com‐
833 mand/4 where it is evident that not only prim_file:drv_command/2 is
834 called but also a fun:
835
836 {[{{prim_file,drv_command,2}, 772, 1578.973, 27.265}],
837 { {prim_file,drv_command,4}, 772, 1578.973, 27.265}, %
838 [{{erlang,port_command,2}, 772, 1458.356, 1456.643},
839 {{prim_file,'-drv_command/2-fun-0-',1}, 772, 87.897, 12.736},
840 {suspend, 50, 4.582, 0.000},
841 {garbage_collect, 25, 0.873, 0.873}]}.
842
843 And some more missing time can be explained by the fact that
844 prim_file:open_int/4 both calls prim_file:drv_command/2 directly as
845 well as through prim_file:open_int_setopts/3, which complicates the
846 picture.
847
848 {[{{prim_file,open,2}, 1, 20.309, 0.029},
849 {{prim_file,open_int,4}, 1, 0.000, 0.057}],
850 { {prim_file,open_int,4}, 2, 20.309, 0.086}, %
851 [{{prim_file,drv_command,2}, 1, 19.755, 0.017},
852 {{prim_file,open_int_setopts,3}, 1, 0.360, 0.032},
853 {{prim_file,drv_open,2}, 1, 0.071, 0.030},
854 {{erlang,list_to_binary,1}, 1, 0.020, 0.020},
855 {{prim_file,i32,1}, 1, 0.017, 0.017},
856 {{prim_file,open_int,4}, 1, 0.000, 0.057}]}.
857 {[{{prim_file,open_int,4}, 1, 0.360, 0.032},
858 {{prim_file,open_int_setopts,3}, 1, 0.000, 0.016}],
859 { {prim_file,open_int_setopts,3}, 2, 0.360, 0.048}, %
860 [{suspend, 1, 0.165, 0.000},
861 {{prim_file,drv_command,2}, 1, 0.147, 0.016},
862 {{prim_file,open_int_setopts,3}, 1, 0.000, 0.016}]}.
863
865 The actual supervision of execution times is in itself a CPU intensive
866 activity. A message is written on the trace file for every function
867 call that is made by the profiled code.
868
869 The ACC time calculation is sometimes difficult to make correct, since
870 it is difficult to define. This happens especially when a function oc‐
871 curs in several instances in the call stack, for example by calling it‐
872 self perhaps through other functions and perhaps even non-tail recur‐
873 sively.
874
875 To produce sensible results, fprof tries not to charge any function
876 more than once for ACC time. The instance highest up (with longest du‐
877 ration) in the call stack is chosen.
878
879 Sometimes a function may unexpectedly waste a lot (some 10 ms or more
880 depending on host machine OS) of OWN (and ACC) time, even functions
881 that do practically nothing at all. The problem may be that the OS has
882 chosen to schedule out the Erlang runtime system process for a while,
883 and if the OS does not support high resolution cpu time measurements
884 fprof will use wallclock time for its calculations, and it will appear
885 as functions randomly burn virtual machine time.
886
888 dbg(3), eprof(3), erlang(3), io(3), Tools User's Guide
889
890
891
892Ericsson AB tools 3.5.2 fprof(3)