1mprof-report(1)             General Commands Manual            mprof-report(1)
2
3
4

The Mono log profiler

6       The Mono log profiler can be used to collect a lot of information about
7       a program running in the Mono runtime.  This data  can  be  used  (both
8       while  the  process is running and later) to do analyses of the program
9       behaviour, determine resource usage, performance issues  or  even  look
10       for particular execution patterns.
11
12       This is accomplished by logging the events provided by the Mono runtime
13       through the profiling interface and periodically writing them to a file
14       which can be later inspected with the command line mprof-report program
15       or with a GUI (not developed yet).
16
17       The events collected include (among others):
18
19       · method enter and leave
20
21       · object allocation
22
23       · garbage collection
24
25       · JIT compilation
26
27       · metadata loading
28
29       · lock contention
30
31       · exceptions
32
33       In addition, the profiler can periodically collect info about  all  the
34       objects present in the heap at the end of a garbage collection (this is
35       called heap shot and currently implemented only for  the  sgen  garbage
36       collector).  Another available profiler mode is the sampling or statis‐
37       tical mode: periodically the program is  sampled  and  the  information
38       about  what  the  program  was  busy with is saved.  This allows to get
39       information about the program behaviour without degrading  its  perfor‐
40       mance too much (usually less than 10%).
41
42   Basic profiler usage
43       The simpler way to use the profiler is the following:
44
45       mono --profile=log program.exe
46
47       At  the  end of the execution the file output.mlpd will be found in the
48       current directory.  A summary report of the data can be printed by run‐
49       ning:
50
51       mprof-report output.mlpd
52
53       With  this invocation a huge amount of data is collected about the pro‐
54       gram execution and collecting and saving this  data  can  significantly
55       slow  down  program  execution.   If  saving  the profiling data is not
56       needed, a report can be generated directly with:
57
58       mono --profile=log:report program.exe
59
60       If the information about allocations is not  of  interest,  it  can  be
61       excluded:
62
63       mono --profile=log:noalloc program.exe
64
65       On  the other hand, if method call timing is not important, while allo‐
66       cations are, the needed info can be gathered with:
67
68       mono --profile=log:nocalls program.exe
69
70       You will still be able to inspect information  about  the  sequence  of
71       calls  that lead to each allocation because at each object allocation a
72       stack trace is collected if full enter/leave information is not  avail‐
73       able.
74
75       To  periodically  collect heap shots (and exclude method and allocation
76       events) use the following options (making sure you run  with  the  sgen
77       garbage collector):
78
79       mono --gc=sgen --profile=log:heapshot program.exe
80
81       To perform a sampling profiler run, use the sample option:
82
83       mono --profile=log:sample program.exe
84
85   Profiler option documentation
86       By  default the log profiler will gather all the events provided by the
87       Mono runtime and write them to  a  file  named  output.mlpd.   When  no
88       option is specified, it is equivalent to using:
89
90       --profile=log:calls,alloc,output=output.mlpd,maxframes=32,calldepth=100
91
92       The  following  options  can  be used to modify this default behaviour.
93       Each option is separated from the next by a , character, with no spaces
94       and  all  the options are included after the log: profile module speci‐
95       fier.
96
97       · help: display concise help info about each available option
98
99       · [no]alloc: noalloc disables collecting object allocation info,  alloc
100         enables it if it was disabled by another option like heapshot.
101
102       · [no]calls: nocalls disables collecting method enter and leave events.
103         When this option is used at each object allocation and at some  other
104         events  (like lock contentions and exception throws) a stack trace is
105         collected by default.  See the maxframes option to control  this  be‐
106         haviour.   calls  enables method enter/leave events if they were dis‐
107         abled by another option like heapshot.
108
109       · heapshot[=MODE]: collect heap shot data  at  each  major  collection.
110         The  frequency of the heap shots can be changed with the MODE parame‐
111         ter.   When  this  option  is  used  allocation  events  and   method
112         enter/leave  events  are not recorded by default: if they are needed,
113         they need to be enabled explicitly.  The optional parameter MODE  can
114         modify  the default heap shot frequency.  heapshot can be used multi‐
115         ple times with different modes: in that case a heap shot is taken  if
116         either of the conditions are met.  MODE can be one of:
117
118         · NUMms:  perform  a  heap  shot  if at least NUM milliseconds passed
119           since the last one.
120
121         · NUMgc: perform a heap shot every NUM major garbage collections
122
123         · ondemand: perform a heap shot when such a command is  sent  to  the
124           control port
125
126       · sample[=FREQ]:  collect statistical samples of the program behaviour.
127         The default is to collect  a  100  times  per  second  (100  Hz)  the
128         instruction pointer.  This is equivalent to the value “100”.  A value
129         of zero for FREQ effectively disables sampling.
130
131       · maxframes=NUM: when a stack trace needs to be performed, collect  NUM
132         frames at the most.  The default is 32.
133
134       · maxsamples=NUM:  stop  allocating  reusable  sample  events  once NUM
135         events have been allocated (a value of zero for all intents and  pur‐
136         poses  means unlimited). By default, the value of this setting is the
137         number of CPU cores multiplied by 1000. This is usually a good enough
138         value for typical desktop and mobile apps.  If you're losing too many
139         samples due to this default (which is possible in apps with an unusu‐
140         ally  high amount of threads), you may want to tinker with this value
141         to find a good balance between sample hit rate and performance impact
142         on  the  app. The way it works is that sample events are enqueued for
143         reuse after they're flushed to the output file; if a  thread  gets  a
144         sampling signal but there are no sample events in the reuse queue and
145         the profiler has reached the maximum number  of  sample  allocations,
146         the  sample  gets  dropped.  So a higher number for this setting will
147         increase the chance that a thread is able to collect  a  sample,  but
148         also  necessarily means that there will be more work done by the pro‐
149         filer. You can run Mono with the --stats  option  to  see  statistics
150         about sample events.
151
152       · calldepth=NUM:  ignore  method enter/leave events when the call chain
153         depth is bigger than NUM.
154
155       · zip: automatically compress the output data in gzip format.
156
157       · output=OUTSPEC: instead of writing the profiling  data  to  the  out‐
158         put.mlpd  file,  substitute %p in OUTSPEC with the current process id
159         and %t with the current date and time, then do according to OUTSPEC:
160
161         · if OUTSPEC begins with a | character, execute the rest as a program
162           and feed the data to its standard input
163
164         · if  OUTSPEC  begins  with a - character, use the rest of OUTSPEC as
165           the filename, but force overwrite any existing file by that name
166
167         · otherwise write the data the the named file: note that is a file by
168           that name already exists, a warning is issued and profiling is dis‐
169           abled.
170
171       · report: the profiling data is sent to mprof-report, which will  print
172         a  summary  report.   This is equivalent to the option: output=mprof-
173         report -.  If the output option is specified as well, the report will
174         be written to the output file instead of the console.
175
176       · port=PORT:  specify  the tcp/ip port to use for the listening command
177         server.  Currently not available for windows.  This server is started
178         for  example  when  heapshot=ondemand  is used: it will read commands
179         line by line.  The following commands are available:
180
181         · heapshot: perform a heapshot as soon as possible
182
183       · nocounters: disables sampling of runtime  and  performance  counters,
184         which is normally done every 1 second.
185
186       · coverage: collect code coverage data. This implies enabling the calls
187         option.
188
189       · onlycoverage: can only be used  with  coverage.  This  disables  most
190         other events so that the profiler mostly only collects coverage data.
191
192   Analyzing the profile data
193       Currently there is a command line program (mprof-report) to analyze the
194       data produced by the profiler.  This  is  ran  automatically  when  the
195       report profiler option is used.  Simply run:
196
197       mprof-report output.mlpd
198
199       to see a summary report of the data included in the file.
200
201   Trace information for events
202       Often  it  is  important  for  some events, like allocations, lock con‐
203       tention and exception throws to know where they happened.   Or  we  may
204       want to see what sequence of calls leads to a particular method invoca‐
205       tion.  To see this info invoke mprof-report as follows:
206
207       mprof-report --traces output.mlpd
208
209       The maximum number of methods in each stack trace can be specified with
210       the --maxframes=NUM option:
211
212       mprof-report --traces --maxframes=4 output.mlpd
213
214       The  stack  trace  info  will be available if method enter/leave events
215       have been recorded or if stack trace collection wasn't explicitly  dis‐
216       abled with the maxframes=0 profiler option.
217
218       The  --traces option also controls the reverse reference feature in the
219       heapshot report: for each class  it  reports  how  many  references  to
220       objects of that class come from other classes.
221
222   Sort order for methods and allocations
223       When  a  list  of methods is printed the default sort order is based on
224       the total time spent in the method.  This time is wall clock time (that
225       is,  it  includes the time spent, for example, in a sleep call, even if
226       actual cpu time would be basically 0).  Also, if the  method  has  been
227       ran  on  different  threads, the time will be a sum of the time used in
228       each thread.
229
230       To change the sort order, use the option:
231
232       --method-sort=MODE
233
234       where MODE can be:
235
236       · self: amount of time spent in  the  method  itself  and  not  in  its
237         callees
238
239       · calls: the number of method invocations
240
241       · total: the total time spent in the method.
242
243       Object  allocation  lists  are sorted by default depending on the total
244       amount of bytes used by each type.
245
246       To change the sort order of object allocations, use the option:
247
248       --alloc-sort=MODE
249
250       where MODE can be:
251
252       · count: the number of allocated objects of the given type
253
254       · bytes: the total number of bytes used by objects of the given type
255
256       To change the sort order of counters, use the option:
257
258       --counters-sort=MODE
259
260       where MODE can be:
261
262       · time: sort values by time then category
263
264       · category: sort values by category then time
265
266   Selecting what data to report
267       The profiler by default collects data about many runtime subsystems and
268       mprof-report  prints  a summary of all the subsystems that are found in
269       the data file.  It is possible to tell mprof-report to only show infor‐
270       mation about some of them with the following option:
271
272       --reports=R1[,R2...]
273
274       where the report names R1, R2 etc.  can be:
275
276       · header: information about program startup and profiler version
277
278       · jit: JIT compiler information
279
280       · sample: statistical sampling information
281
282       · gc: garbage collection information
283
284       · alloc: object allocation information
285
286       · call: method profiling information
287
288       · metadata: metadata events like image loads
289
290       · exception: exception throw and handling information
291
292       · monitor: lock contention information
293
294       · thread: thread information
295
296       · domain: app domain information
297
298       · context: remoting context information
299
300       · heapshot: live heap usage at heap shots
301
302       · counters: counters samples
303
304       · coverage: code coverage data
305
306       · stats: event statistics
307
308       It  is  possible  to limit some of the data displayed to a timeframe of
309       the program execution with the option:
310
311       --time=FROM-TO
312
313       where FROM and TO are seconds since application startup  (they  can  be
314       floating point numbers).
315
316       Another  interesting  option  is to consider only events happening on a
317       particular thread with the following option:
318
319       --thread=THREADID
320
321       where THREADID is one of the  numbers  listed  in  the  thread  summary
322       report (or a thread name when present).
323
324       By default long lists of methods or other information like object allo‐
325       cations are limited to the most important data.  To increase the amount
326       of information printed you can use the option:
327
328       --verbose
329
330   Track individual objects
331       Instead  of  printing  the  usual reports from the profiler data, it is
332       possible to track some  interesting  information  about  some  specific
333       object addresses.  The objects are selected based on their address with
334       the --track option as follows:
335
336       --track=0xaddr1[,0xaddr2,...]
337
338       The reported info (if available in the data file), will be class  name,
339       size,  creation  time,  stack  trace  of  creation  (with  the --traces
340       option), etc.  If heapshot data is available it  will  be  possible  to
341       also track what other objects reference one of the listed addresses.
342
343       The object addresses can be gathered either from the profiler report in
344       some cases (like in the monitor lock report), from the live application
345       or  they can be selected with the --find=FINDSPEC option.  FINDSPEC can
346       be one of the following:
347
348       · S:SIZE: where the object is selected if its size is at least SIZE
349
350       · T:NAME: where the object is selected if NAME  partially  matches  its
351         class name
352
353       This  option  can be specified multiple times with one of the different
354       kinds of FINDSPEC.  For example, the following:
355
356       --find=S:10000 --find=T:Byte[]
357
358       will find all the byte arrays that are at least 10000 bytes in size.
359
360       Note that with a  moving  garbage  collector  the  object  address  can
361       change,  so you may need to track the changed address manually.  It can
362       also happen that multiple objects are allocated at the same address, so
363       the output from this option can become large.
364
365   Saving a profiler report
366       By default mprof-report will print the summary data to the console.  To
367       print it to a file, instead, use the option:
368
369       --out=FILENAME
370
371   Processing code coverage data
372       If you ran the profiler with the coverage option, you can  process  the
373       collected  coverage  data into an XML file by running mprof-report like
374       this:
375
376       mprof-report --coverage-out=coverage.xml output.mlpd
377
378   Dealing with profiler slowness
379       If the profiler needs to collect lots of data,  the  execution  of  the
380       program  will  slow  down significantly, usually 10 to 20 times slower.
381       There are several ways to reduce the impact of the profiler on the pro‐
382       gram execution.
383
384       Use the statistical sampling mode
385
386           Statistical  sampling allows executing a program under the profiler
387           with minimal performance overhead (usually less  than  10%).   This
388           mode allows checking where the program is spending most of its exe‐
389           cution time without significantly perturbing its behaviour.
390
391       Collect less data
392
393           Collecting method enter/leave events can be very  expensive,  espe‐
394           cially  in  programs that perform many millions of tiny calls.  The
395           profiler option nocalls can be used to avoid collecting  this  data
396           or  it  can be limited to only a few call levels with the calldepth
397           option.
398
399           Object allocation information is expensive  as  well,  though  much
400           less than method enter/leave events.  If it's not needed, it can be
401           skipped with the noalloc profiler option.  Note  that  when  method
402           enter/leave  events are discarded, by default stack traces are col‐
403           lected at each allocation and this can be expensive as  well.   The
404           impact  of  stack trace information can be reduced by setting a low
405           value with the maxframes option or by eliminating them  completely,
406           by setting it to 0.
407
408           The  other  major  source  of data is the heapshot profiler option:
409           especially if the managed heap is big, since every object needs  to
410           be  inspected.   The  MODE  parameter of the heapshot option can be
411           used to reduce the frequency of the heap shots.
412
413   Dealing with the size of the data files
414       When collecting a lot of information about  a  profiled  program,  huge
415       data  files  can  be  generated.   There are a few ways to minimize the
416       amount of data, for example by not collecting some of the  more  space-
417       consuming  information  or by compressing the information on the fly or
418       by just generating a summary report.
419
420       Reducing the amount of data
421
422           Method enter/leave events  can  be  excluded  completely  with  the
423           nocalls option or they can be limited to just a few levels of calls
424           with the calldepth option.  For example, the option:
425
426           calldepth=10
427
428           will ignore the method events when there are more than  10  managed
429           stack  frames.   This  is  very  useful for programs that have deep
430           recursion or for programs that perform many millions of tiny  calls
431           deep  enough  in  the  call  stack.   The  optimal  number  for the
432           calldepth option depends on the program and it needs to be balanced
433           between  providing  enough  profiling information and allowing fast
434           execution speed.
435
436           Note that by default, if method events are not recorded at all, the
437           profiler  will collect stack trace information at events like allo‐
438           cations.  To avoid gathering this data, use  the  maxframes=0  pro‐
439           filer option.
440
441           Allocation events can be eliminated with the noalloc option.
442
443           Heap shot data can also be huge: by default it is collected at each
444           major collection.  To reduce the frequency, you can specify a heap‐
445           shot  mode:  for  example to collect every 5 collections (including
446           major and minor):
447
448           heapshot=5gc
449
450           or when at least 5 seconds passed since the last heap shot:
451
452           heapshot=5000ms
453
454       Compressing the data
455
456           To reduce the amout of disk space used by the data, the data can be
457           compressed  either  after  it has been generated with the gzip com‐
458           mand:
459
460           gzip -9 output.mlpd
461
462           or it can be compressed automatically by  using  the  zip  profiler
463           option.   Note that in this case there could be a significant slow‐
464           down of the profiled program.
465
466           The mprof-report program will tranparently deal  with  either  com‐
467           pressed or uncompressed data files.
468
469       Generating only a summary report
470
471           Often  it's  enough to look at the profiler summary report to diag‐
472           nose an issue and in this case it's possible to  avoid  saving  the
473           profiler  data  file  to  disk.   This can be accomplished with the
474           report profiler option, which will basically send the data  to  the
475           mprof-report program for display.
476
477           To have more control of what summary information is reported (or to
478           use a completely different program to decode  the  profiler  data),
479           the output profiler option can be used, with | as the first charac‐
480           ter: the rest of the output name will be executed as a program with
481           the data fed in on the standard input.
482
483           For  example,  to  print  only the Monitor summary with stack trace
484           information, you could use it like this:
485
486           output=|mprof-report --reports=monitor --traces -
487

WEB SITE

489       http://www.mono-project.com/docs/debug+profile/profile/profiler/
490

SEE ALSO

492       mono(1)
493

AUTHORS

495       Paolo Molaro, Alex Rønne Petersen
496
497
498
499                                                               mprof-report(1)
Impressum