DBI::Profile(3pm)

1DBI::Profile(3)       User Contributed Perl Documentation      DBI::Profile(3)
2
3
4

NAME

6       DBI::Profile - Performance profiling and benchmarking for the DBI
7

SYNOPSIS

9       The easiest way to enable DBI profiling is to set the DBI_PROFILE
10       environment variable to 2 and then run your code as usual:
11
12         DBI_PROFILE=2 prog.pl
13
14       This will profile your program and then output a textual summary
15       grouped by query when the program exits.  You can also enable profiling
16       by setting the Profile attribute of any DBI handle:
17
18         $dbh->{Profile} = 2;
19
20       Then the summary will be printed when the handle is destroyed.
21
22       Many other values apart from are possible - see "ENABLING A PROFILE"
23       below.
24

DESCRIPTION

26       The DBI::Profile module provides a simple interface to collect and
27       report performance and benchmarking data from the DBI.
28
29       For a more elaborate interface, suitable for larger programs, see
30       DBI::ProfileDumper and dbiprof.  For Apache/mod_perl applications see
31       DBI::ProfileDumper::Apache.
32

OVERVIEW

34       Performance data collection for the DBI is built around several
35       concepts which are important to understand clearly.
36
37       Method Dispatch
38           Every method call on a DBI handle passes through a single
39           'dispatch' function which manages all the common aspects of DBI
40           method calls, such as handling the RaiseError attribute.
41
42       Data Collection
43           If profiling is enabled for a handle then the dispatch code takes a
44           high-resolution timestamp soon after it is entered. Then, after
45           calling the appropriate method and just before returning, it takes
46           another high-resolution timestamp and calls a function to record
47           the information.  That function is passed the two timestamps plus
48           the DBI handle and the name of the method that was called.  That
49           data about a single DBI method call is called a profile sample.
50
51       Data Filtering
52           If the method call was invoked by the DBI or by a driver then the
53           call is ignored for profiling because the time spent will be
54           accounted for by the original 'outermost' call for your code.
55
56           For example, the calls that the selectrow_arrayref() method makes
57           to prepare() and execute() etc. are not counted individually
58           because the time spent in those methods is going to be allocated to
59           the selectrow_arrayref() method when it returns. If this was not
60           done then it would be very easy to double count time spent inside
61           the DBI.
62
63       Data Storage Tree
64           The profile data is accumulated as 'leaves on a tree'. The 'path'
65           through the branches of the tree to a particular leaf is determined
66           dynamically for each sample.  This is a key feature of DBI
67           profiling.
68
69           For each profiled method call the DBI walks along the Path and uses
70           each value in the Path to step into and grow the Data tree.
71
72           For example, if the Path is
73
74             [ 'foo', 'bar', 'baz' ]
75
76           then the new profile sample data will be merged into the tree at
77
78             $h->{Profile}->{Data}->{foo}->{bar}->{baz}
79
80           But it's not very useful to merge all the call data into one leaf
81           node (except to get an overall 'time spent inside the DBI' total).
82           It's more common to want the Path to include dynamic values such as
83           the current statement text and/or the name of the method called to
84           show what the time spent inside the DBI was for.
85
86           The Path can contain some 'magic cookie' values that are
87           automatically replaced by corresponding dynamic values when they're
88           used. These magic cookies always start with a punctuation
89           character.
90
91           For example a value of '"!MethodName"' in the Path causes the
92           corresponding entry in the Data to be the name of the method that
93           was called.  For example, if the Path was:
94
95             [ 'foo', '!MethodName', 'bar' ]
96
97           and the selectall_arrayref() method was called, then the profile
98           sample data for that call will be merged into the tree at:
99
100             $h->{Profile}->{Data}->{foo}->{selectall_arrayref}->{bar}
101
102       Profile Data
103           Profile data is stored at the 'leaves' of the tree as references to
104           an array of numeric values. For example:
105
106             [
107               106,                  # 0: count of samples at this node
108               0.0312958955764771,   # 1: total duration
109               0.000490069389343262, # 2: first duration
110               0.000176072120666504, # 3: shortest duration
111               0.00140702724456787,  # 4: longest duration
112               1023115819.83019,     # 5: time of first sample
113               1023115819.86576,     # 6: time of last sample
114             ]
115
116           After the first sample, later samples always update elements 0, 1,
117           and 6, and may update 3 or 4 depending on the duration of the
118           sampled call.
119

ENABLING A PROFILE

121       Profiling is enabled for a handle by assigning to the Profile
122       attribute. For example:
123
124         $h->{Profile} = DBI::Profile->new();
125
126       The Profile attribute holds a blessed reference to a hash object that
127       contains the profile data and attributes relating to it.
128
129       The class the Profile object is blessed into is expected to provide at
130       least a DESTROY method which will dump the profile data to the DBI
131       trace file handle (STDERR by default).
132
133       All these examples have the same effect as each other:
134
135         $h->{Profile} = 0;
136         $h->{Profile} = "/DBI::Profile";
137         $h->{Profile} = DBI::Profile->new();
138         $h->{Profile} = {};
139         $h->{Profile} = { Path => [] };
140
141       Similarly, these examples have the same effect as each other:
142
143         $h->{Profile} = 6;
144         $h->{Profile} = "6/DBI::Profile";
145         $h->{Profile} = "!Statement:!MethodName/DBI::Profile";
146         $h->{Profile} = { Path => [ '!Statement', '!MethodName' ] };
147
148       If a non-blessed hash reference is given then the DBI::Profile module
149       is automatically "require"'d and the reference is blessed into that
150       class.
151
152       If a string is given then it is processed like this:
153
154           ($path, $module, $args) = split /\//, $string, 3
155
156           @path = split /:/, $path
157           @args = split /:/, $args
158
159           eval "require $module" if $module
160           $module ||= "DBI::Profile"
161
162           $module->new( Path => \@Path, @args )
163
164       So the first value is used to select the Path to be used (see below).
165       The second value, if present, is used as the name of a module which
166       will be loaded and it's "new" method called. If not present it defaults
167       to DBI::Profile. Any other values are passed as arguments to the "new"
168       method. For example: ""2/DBIx::OtherProfile/Foo:42"".
169
170       Numbers can be used as a shorthand way to enable common Path values.
171       The simplest way to explain how the values are interpreted is to show
172       the code:
173
174           push @Path, "DBI"           if $path_elem & 0x01;
175           push @Path, "!Statement"    if $path_elem & 0x02;
176           push @Path, "!MethodName"   if $path_elem & 0x04;
177           push @Path, "!MethodClass"  if $path_elem & 0x08;
178           push @Path, "!Caller2"      if $path_elem & 0x10;
179
180       So "2" is the same as "!Statement" and "6" (2+4) is the same as
181       "!Statement:!Method".  Those are the two most commonly used values.
182       Using a negative number will reverse the path. Thus "-6" will group by
183       method name then statement.
184
185       The splitting and parsing of string values assigned to the Profile
186       attribute may seem a little odd, but there's a good reason for it.
187       Remember that attributes can be embedded in the Data Source Name string
188       which can be passed in to a script as a parameter. For example:
189
190           dbi:DriverName(Profile=>2):dbname
191           dbi:DriverName(Profile=>{Username}:!Statement/MyProfiler/Foo:42):dbname
192
193       And also, if the "DBI_PROFILE" environment variable is set then The DBI
194       arranges for every driver handle to share the same profile object. When
195       perl exits a single profile summary will be generated that reflects (as
196       nearly as practical) the total use of the DBI by the application.
197

THE PROFILE OBJECT

199       The DBI core expects the Profile attribute value to be a hash reference
200       and if the following values don't exist it will create them as needed:
201
202   Data
203       A reference to a hash containing the collected profile data.
204
205   Path
206       The Path value is a reference to an array. Each element controls the
207       value to use at the corresponding level of the profile Data tree.
208
209       If the value of Path is anything other than an array reference, it is
210       treated as if it was:
211
212               [ '!Statement' ]
213
214       The elements of Path array can be one of the following types:
215
216       Special Constant
217
218       !Statement
219
220       Use the current Statement text. Typically that's the value of the
221       Statement attribute for the handle the method was called with. Some
222       methods, like commit() and rollback(), are unrelated to a particular
223       statement. For those methods !Statement records an empty string.
224
225       For statement handles this is always simply the string that was given
226       to prepare() when the handle was created.  For database handles this is
227       the statement that was last prepared or executed on that database
228       handle. That can lead to a little 'fuzzyness' because, for example,
229       calls to the quote() method to build a new statement will typically be
230       associated with the previous statement. In practice this isn't a
231       significant issue and the dynamic Path mechanism can be used to setup
232       your own rules.
233
234       !MethodName
235
236       Use the name of the DBI method that the profile sample relates to.
237
238       !MethodClass
239
240       Use the fully qualified name of the DBI method, including the package,
241       that the profile sample relates to. This shows you where the method was
242       implemented. For example:
243
244         'DBD::_::db::selectrow_arrayref' =>
245             0.022902s
246         'DBD::mysql::db::selectrow_arrayref' =>
247             2.244521s / 99 = 0.022445s avg (first 0.022813s, min 0.022051s, max 0.028932s)
248
249       The "DBD::_::db::selectrow_arrayref" shows that the driver has
250       inherited the selectrow_arrayref method provided by the DBI.
251
252       But you'll note that there is only one call to
253       DBD::_::db::selectrow_arrayref but another 99 to
254       DBD::mysql::db::selectrow_arrayref. Currently the first call doesn't
255       record the true location. That may change.
256
257       !Caller
258
259       Use a string showing the filename and line number of the code calling
260       the method.
261
262       !Caller2
263
264       Use a string showing the filename and line number of the code calling
265       the method, as for !Caller, but also include filename and line number
266       of the code that called that. Calls from DBI:: and DBD:: packages are
267       skipped.
268
269       !File
270
271       Same as !Caller above except that only the filename is included, not
272       the line number.
273
274       !File2
275
276       Same as !Caller2 above except that only the filenames are included, not
277       the line number.
278
279       !Time
280
281       Use the current value of time(). Rarely used. See the more useful
282       "!Time~N" below.
283
284       !Time~N
285
286       Where "N" is an integer. Use the current value of time() but with
287       reduced precision.  The value used is determined in this way:
288
289           int( time() / N ) * N
290
291       This is a useful way to segregate a profile into time slots. For
292       example:
293
294           [ '!Time~60', '!Statement' ]
295
296       Code Reference
297
298       The subroutine is passed the handle it was called on and the DBI method
299       name.  The current Statement is in $_. The statement string should not
300       be modified, so most subs start with "local $_ = $_;".
301
302       The list of values it returns is used at that point in the Profile
303       Path.  Any undefined values are treated as the string ""undef"".
304
305       The sub can 'veto' (reject) a profile sample by including a reference
306       to undef ("\undef") in the returned list. That can be useful when you
307       want to only profile statements that match a certain pattern, or only
308       profile certain methods.
309
310       Subroutine Specifier
311
312       A Path element that begins with '"&"' is treated as the name of a
313       subroutine in the DBI::ProfileSubs namespace and replaced with the
314       corresponding code reference.
315
316       Currently this only works when the Path is specified by the
317       "DBI_PROFILE" environment variable.
318
319       Also, currently, the only subroutine in the DBI::ProfileSubs namespace
320       is '&norm_std_n3'. That's a very handy subroutine when profiling code
321       that doesn't use placeholders. See DBI::ProfileSubs for more
322       information.
323
324       Attribute Specifier
325
326       A string enclosed in braces, such as '"{Username}"', specifies that the
327       current value of the corresponding database handle attribute should be
328       used at that point in the Path.
329
330       Reference to a Scalar
331
332       Specifies that the current value of the referenced scalar be used at
333       that point in the Path.  This provides an efficient way to get
334       'contextual' values into your profile.
335
336       Other Values
337
338       Any other values are stringified and used literally.
339
340       (References, and values that begin with punctuation characters are
341       reserved.)
342

REPORTING

344   Report Format
345       The current accumulated profile data can be formatted and output using
346
347           print $h->{Profile}->format;
348
349       To discard the profile data and start collecting fresh data you can do:
350
351           $h->{Profile}->{Data} = undef;
352
353       The default results format looks like this:
354
355         DBI::Profile: 0.001015s 42.7% (5 calls) programname @ YYYY-MM-DD HH:MM:SS
356         '' =>
357             0.000024s / 2 = 0.000012s avg (first 0.000015s, min 0.000009s, max 0.000015s)
358         'SELECT mode,size,name FROM table' =>
359             0.000991s / 3 = 0.000330s avg (first 0.000678s, min 0.000009s, max 0.000678s)
360
361       Which shows the total time spent inside the DBI, with a count of the
362       total number of method calls and the name of the script being run, then
363       a formatted version of the profile data tree.
364
365       If the results are being formatted when the perl process is exiting
366       (which is usually the case when the DBI_PROFILE environment variable is
367       used) then the percentage of time the process spent inside the DBI is
368       also shown. If the process is not exiting then the percentage is
369       calculated using the time between the first and last call to the DBI.
370
371       In the example above the paths in the tree are only one level deep and
372       use the Statement text as the value (that's the default behaviour).
373
374       The merged profile data at the 'leaves' of the tree are presented as
375       total time spent, count, average time spent (which is simply total time
376       divided by the count), then the time spent on the first call, the time
377       spent on the fastest call, and finally the time spent on the slowest
378       call.
379
380       The 'avg', 'first', 'min' and 'max' times are not particularly useful
381       when the profile data path only contains the statement text.  Here's an
382       extract of a more detailed example using both statement text and method
383       name in the path:
384
385         'SELECT mode,size,name FROM table' =>
386             'FETCH' =>
387                 0.000076s
388             'fetchrow_hashref' =>
389                 0.036203s / 108 = 0.000335s avg (first 0.000490s, min 0.000152s, max 0.002786s)
390
391       Here you can see the 'avg', 'first', 'min' and 'max' for the 108 calls
392       to fetchrow_hashref() become rather more interesting.  Also the data
393       for FETCH just shows a time value because it was only called once.
394
395       Currently the profile data is output sorted by branch names. That may
396       change in a later version so the leaf nodes are sorted by total time
397       per leaf node.
398
399   Report Destination
400       The default method of reporting is for the DESTROY method of the
401       Profile object to format the results and write them using:
402
403           DBI->trace_msg($results, 0);  # see $ON_DESTROY_DUMP below
404
405       to write them to the DBI trace() filehandle (which defaults to STDERR).
406       To direct the DBI trace filehandle to write to a file without enabling
407       tracing the trace() method can be called with a trace level of 0. For
408       example:
409
410           DBI->trace(0, $filename);
411
412       The same effect can be achieved without changing the code by setting
413       the "DBI_TRACE" environment variable to "0=filename".
414
415       The $DBI::Profile::ON_DESTROY_DUMP variable holds a code ref that's
416       called to perform the output of the formatted results.  The default
417       value is:
418
419         $ON_DESTROY_DUMP = sub { DBI->trace_msg($results, 0) };
420
421       Apart from making it easy to send the dump elsewhere, it can also be
422       useful as a simple way to disable dumping results.
423

CHILD HANDLES

425       Child handles inherit a reference to the Profile attribute value of
426       their parent.  So if profiling is enabled for a database handle then by
427       default the statement handles created from it all contribute to the
428       same merged profile data tree.
429

PROFILE OBJECT METHODS

431   format
432       See "REPORTING".
433
434   as_node_path_list
435         @ary = $dbh->{Profile}->as_node_path_list();
436         @ary = $dbh->{Profile}->as_node_path_list($node, $path);
437
438       Returns the collected data ($dbh->{Profile}{Data}) restructured into a
439       list of array refs, one for each leaf node in the Data tree. This
440       'flat' structure is often much simpler for applications to work with.
441
442       The first element of each array ref is a reference to the leaf node.
443       The remaining elements are the 'path' through the data tree to that
444       node.
445
446       For example, given a data tree like this:
447
448           {key1a}{key2a}[node1]
449           {key1a}{key2b}[node2]
450           {key1b}{key2a}{key3a}[node3]
451
452       The as_node_path_list() method  will return this list:
453
454           [ [node1], 'key1a', 'key2a' ]
455           [ [node2], 'key1a', 'key2b' ]
456           [ [node3], 'key1b', 'key2a', 'key3a' ]
457
458       The nodes are ordered by key, depth-first.
459
460       The $node argument can be used to focus on a sub-tree.  If not
461       specified it defaults to $dbh->{Profile}{Data}.
462
463       The $path argument can be used to specify a list of path elements that
464       will be added to each element of the returned list. If not specified it
465       defaults to a ref to an empty array.
466
467   as_text
468         @txt = $dbh->{Profile}->as_text();
469         $txt = $dbh->{Profile}->as_text({
470             node      => undef,
471             path      => [],
472             separator => " > ",
473             format    => '%1$s: %11$fs / %10$d = %2$fs avg (first %12$fs, min %13$fs, max %14$fs)'."\n";
474             sortsub   => sub { ... },
475         );
476
477       Returns the collected data ($dbh->{Profile}{Data}) reformatted into a
478       list of formatted strings.  In scalar context the list is returned as a
479       single concatenated string.
480
481       A hashref can be used to pass in arguments, the default values are
482       shown in the example above.
483
484       The "node" and <path> arguments are passed to as_node_path_list().
485
486       The "separator" argument is used to join the elements of the path for
487       each leaf node.
488
489       The "sortsub" argument is used to pass in a ref to a sub that will
490       order the list.  The subroutine will be passed a reference to the array
491       returned by as_node_path_list() and should sort the contents of the
492       array in place.  The return value from the sub is ignored. For example,
493       to sort the nodes by the second level key you could use:
494
495         sortsub => sub { my $ary=shift; @$ary = sort { $a->[2] cmp $b->[2] } @$ary }
496
497       The "format" argument is a "sprintf" format string that specifies the
498       format to use for each leaf node.  It uses the explicit format
499       parameter index mechanism to specify which of the arguments should
500       appear where in the string.  The arguments to sprintf are:
501
502            1:  path to node, joined with the separator
503            2:  average duration (total duration/count)
504                (3 thru 9 are currently unused)
505           10:  count
506           11:  total duration
507           12:  first duration
508           13:  smallest duration
509           14:  largest duration
510           15:  time of first call
511           16:  time of first call
512

CUSTOM DATA MANIPULATION

514       Recall that "$h->{Profile}->{Data}" is a reference to the collected
515       data.  Either to a 'leaf' array (when the Path is empty, i.e.,
516       DBI_PROFILE env var is 1), or a reference to hash containing values
517       that are either further hash references or leaf array references.
518
519       Sometimes it's useful to be able to summarise some or all of the
520       collected data.  The dbi_profile_merge_nodes() function can be used to
521       merge leaf node values.
522
523   dbi_profile_merge_nodes
524         use DBI qw(dbi_profile_merge_nodes);
525
526         $time_in_dbi = dbi_profile_merge_nodes(my $totals=[], @$leaves);
527
528       Merges profile data node. Given a reference to a destination array, and
529       zero or more references to profile data, merges the profile data into
530       the destination array.  For example:
531
532         $time_in_dbi = dbi_profile_merge_nodes(
533             my $totals=[],
534             [ 10, 0.51, 0.11, 0.01, 0.22, 1023110000, 1023110010 ],
535             [ 15, 0.42, 0.12, 0.02, 0.23, 1023110005, 1023110009 ],
536         );
537
538       $totals will then contain
539
540         [ 25, 0.93, 0.11, 0.01, 0.23, 1023110000, 1023110010 ]
541
542       and $time_in_dbi will be 0.93;
543
544       The second argument need not be just leaf nodes. If given a reference
545       to a hash then the hash is recursively searched for leaf nodes and all
546       those found are merged.
547
548       For example, to get the time spent 'inside' the DBI during an http
549       request, your logging code run at the end of the request (i.e. mod_perl
550       LogHandler) could use:
551
552         my $time_in_dbi = 0;
553         if (my $Profile = $dbh->{Profile}) { # if DBI profiling is enabled
554             $time_in_dbi = dbi_profile_merge_nodes(my $total=[], $Profile->{Data});
555             $Profile->{Data} = {}; # reset the profile data
556         }
557
558       If profiling has been enabled then $time_in_dbi will hold the time
559       spent inside the DBI for that handle (and any other handles that share
560       the same profile data) since the last request.
561
562       Prior to DBI 1.56 the dbi_profile_merge_nodes() function was called
563       dbi_profile_merge().  That name still exists as an alias.
564

CUSTOM DATA COLLECTION

566   Using The Path Attribute
567         XXX example to be added later using a selectall_arrayref call
568         XXX nested inside a fetch loop where the first column of the
569         XXX outer loop is bound to the profile Path using
570         XXX bind_column(1, \${ $dbh->{Profile}->{Path}->[0] })
571         XXX so you end up with separate profiles for each loop
572         XXX (patches welcome to add this to the docs :)
573
574   Adding Your Own Samples
575       The dbi_profile() function can be used to add extra sample data into
576       the profile data tree. For example:
577
578           use DBI;
579           use DBI::Profile (dbi_profile dbi_time);
580
581           my $t1 = dbi_time(); # floating point high-resolution time
582
583           ... execute code you want to profile here ...
584
585           my $t2 = dbi_time();
586           dbi_profile($h, $statement, $method, $t1, $t2);
587
588       The $h parameter is the handle the extra profile sample should be
589       associated with. The $statement parameter is the string to use where
590       the Path specifies !Statement. If $statement is undef then
591       $h->{Statement} will be used. Similarly $method is the string to use if
592       the Path specifies !MethodName. There is no default value for $method.
593
594       The $h->{Profile}{Path} attribute is processed by dbi_profile() in the
595       usual way.
596
597       The $h parameter is usually a DBI handle but it can also be a reference
598       to a hash, in which case the dbi_profile() acts on each defined value
599       in the hash.  This is an efficient way to update multiple profiles with
600       a single sample, and is used by the DashProfiler module.
601

SUBCLASSING

603       Alternate profile modules must subclass DBI::Profile to help ensure
604       they work with future versions of the DBI.
605

CAVEATS

607       Applications which generate many different statement strings (typically
608       because they don't use placeholders) and profile with !Statement in the
609       Path (the default) will consume memory in the Profile Data structure
610       for each statement. Use a code ref in the Path to return an edited
611       (simplified) form of the statement.
612
613       If a method throws an exception itself (not via RaiseError) then it
614       won't be counted in the profile.
615
616       If a HandleError subroutine throws an exception (rather than returning
617       0 and letting RaiseError do it) then the method call won't be counted
618       in the profile.
619
620       Time spent in DESTROY is added to the profile of the parent handle.
621
622       Time spent in DBI->*() methods is not counted. The time spent in the
623       driver connect method, $drh->connect(), when it's called by
624       DBI->connect is counted if the DBI_PROFILE environment variable is set.
625
626       Time spent fetching tied variables, $DBI::errstr, is counted.
627
628       Time spent in FETCH for $h->{Profile} is not counted, so getting the
629       profile data doesn't alter it.
630
631       DBI::PurePerl does not support profiling (though it could in theory).
632
633       For asynchronous queries, time spent while the query is running on the
634       backend is not counted.
635
636       A few platforms don't support the gettimeofday() high resolution time
637       function used by the DBI (and available via the dbi_time() function).
638       In which case you'll get integer resolution time which is mostly
639       useless.
640
641       On Windows platforms the dbi_time() function is limited to millisecond
642       resolution. Which isn't sufficiently fine for our needs, but still much
643       better than integer resolution. This limited resolution means that fast
644       method calls will often register as taking 0 time. And timings in
645       general will have much more 'jitter' depending on where within the
646       'current millisecond' the start and end timing was taken.
647
648       This documentation could be more clear. Probably needs to be reordered
649       to start with several examples and build from there.  Trying to explain
650       the concepts first seems painful and to lead to just as many forward
651       references.  (Patches welcome!)
652
653
654
655perl v5.32.1                      2021-01-27                   DBI::Profile(3)