Statistics::Contingency(3pm)

Statistics::Contingency(3)  User Contributed Perl Documentation  Statistics::Contingency(3)

NAME

       Statistics::Contingency - Calculate precision, recall, F1, accuracy,
       etc.

VERSION

       version 0.09

SYNOPSIS

        use Statistics::Contingency;
        my $s = Statistics::Contingency->new(categories => \@all_categories);

        while (...something...) {
          ...
          $s->add_result($assigned_categories, $correct_categories);
        }

        print "Micro F1: ", $s->micro_F1, "\n"; # Access a single statistic
        print $s->stats_table; # Show several stats in table form

DESCRIPTION

       The "Statistics::Contingency" class helps you calculate several useful
       statistical measures based on 2x2 "contingency tables".  I use these
       measures to help judge the results of automatic text categorization
       experiments, but they are useful in other situations as well.

       The general usage flow is to tally a whole bunch of results in the
       "Statistics::Contingency" object, then query that object to obtain the
       measures you are interested in.  When all results have been collected,
       you can get a report on accuracy, precision, recall, F1, and so on,
       with both macro-averaging and micro-averaging over categories.

   Macro vs. Micro Statistics
       All of the statistics offered by this module can be calculated for each
       category and then averaged, or can be calculated once over the pooled
       decisions for all categories.  The former is called macro-averaging
       (specifically, macro-averaging with respect to category), and the
       latter is called micro-averaging.  The two procedures bias the results
       differently - micro-averaging tends to over-emphasize the performance
       on the largest categories, while macro-averaging over-emphasizes the
       performance on the smallest.  It's often best to look at both of them
       to get a good idea of how your data distributes across categories.
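
       The difference between the two averaging procedures can be sketched
       in plain Perl, without the module.  The category names and counts
       below are hypothetical, and only precision is shown; this is an
       illustration of the idea, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical counts for two categories: a = assigned and correct,
# b = assigned but not correct (the top row of the contingency table).
my %tables = (
    big   => { a => 90, b => 10 },
    small => { a => 1,  b => 9  },
);

# Macro-averaging: compute precision per category, then average.
my $macro = 0;
$macro += $_->{a} / ($_->{a} + $_->{b}) for values %tables;
$macro /= keys %tables;

# Micro-averaging: pool the counts first, then compute precision once.
my ($sum_a, $sum_b) = (0, 0);
for my $t (values %tables) {
    $sum_a += $t->{a};
    $sum_b += $t->{b};
}
my $micro = $sum_a / ($sum_a + $sum_b);

printf "macro precision: %.3f\n", $macro;   # 0.500
printf "micro precision: %.3f\n", $micro;   # 0.827
```

       The small category pulls the macro average down to 0.5, while the
       micro average stays close to the large category's precision of 0.9.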

   Statistics available
       All of the statistics are calculated based on a so-called "contingency
       table", which looks like this:

                     Correct=Y   Correct=N
                   +-----------+-----------+
        Assigned=Y |     a     |     b     |
                   +-----------+-----------+
        Assigned=N |     c     |     d     |
                   +-----------+-----------+

       a, b, c, and d are counts that reflect how the assigned categories
       matched the correct categories.  Depending on whether a macro-statistic
       or a micro-statistic is being calculated, these numbers will be tallied
       per-category or for the entire result set.

       The following statistics are available:

       ·   accuracy

           This measures the portion of all decisions that were correct
           decisions.  It is defined as "(a+d)/(a+b+c+d)".  It falls in the
           range from 0 to 1, with 1 being the best score.

           Note that macro-accuracy and micro-accuracy will always give the
           same number.

       ·   error

           This measures the portion of all decisions that were incorrect
           decisions.  It is defined as "(b+c)/(a+b+c+d)".  It falls in the
           range from 0 to 1, with 0 being the best score.

           Note that macro-error and micro-error will always give the same
           number.

       ·   precision

           This measures the portion of the assigned categories that were
           correct.  It is defined as "a/(a+b)".  It falls in the range from 0
           to 1, with 1 being the best score.

       ·   recall

           This measures the portion of the correct categories that were
           assigned.  It is defined as "a/(a+c)".  It falls in the range from
           0 to 1, with 1 being the best score.

       ·   F1

           This measures an even combination of precision and recall (their
           harmonic mean).  It is defined as "2*p*r/(p+r)", where p is
           precision and r is recall.  In terms of a, b, and c, it may be
           expressed as "2a/(2a+b+c)".  It falls in the range from 0 to 1,
           with 1 being the best score.
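
       The five definitions above can be written out directly.  The
       following is a minimal sketch in plain Perl; "contingency_stats" is a
       hypothetical helper, not part of the module, and returning undef for
       a zero denominator is an assumption made here (the module's own
       edge-case handling may differ).

```perl
use strict;
use warnings;

# Illustrative helper (not part of Statistics::Contingency).
# Takes the four cells of one contingency table as named arguments.
# Assumption: a ratio with a zero denominator is returned as undef.
sub contingency_stats {
    my %t = @_;
    my $total = $t{a} + $t{b} + $t{c} + $t{d};
    my $ratio = sub {
        my ($num, $den) = @_;
        return $den ? $num / $den : undef;
    };
    my %stats = (
        accuracy  => $ratio->($t{a} + $t{d}, $total),
        error     => $ratio->($t{b} + $t{c}, $total),
        precision => $ratio->($t{a}, $t{a} + $t{b}),
        recall    => $ratio->($t{a}, $t{a} + $t{c}),
        F1        => $ratio->(2 * $t{a}, 2 * $t{a} + $t{b} + $t{c}),
    );
    return \%stats;
}

my $stats = contingency_stats(a => 40, b => 10, c => 20, d => 30);
printf "%-9s %.3f\n", $_, $stats->{$_}
    for qw(accuracy error precision recall F1);
```

       With these counts, precision is 0.8 and recall is about 0.667, and
       both forms of the F1 definition give 80/110, or about 0.727.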

       The F1 measure is often the only simple measure that is worth trying to
       maximize on its own - consider the fact that you can get a perfect
       precision score by always assigning zero categories, or a perfect
       recall score by always assigning every category.  A truly smart system
       will assign the correct categories and only the correct categories,
       maximizing precision and recall at the same time, and therefore
       maximizing the F1 score.

       Sometimes it's worth trying to maximize the accuracy score, but
       accuracy (and its counterpart error) are considered fairly crude scores
       that don't give much information about the performance of a
       categorizer.
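
       A small worked example may make this concrete.  The sketch below
       uses hypothetical counts for a single rare category (100 documents,
       5 of which belong to it) and compares two degenerate strategies.  It
       is plain Perl and does not use the module.

```perl
use strict;
use warnings;

# Accuracy, recall, and F1 for one table, using the formulas above.
sub measures {
    my ($t) = @_;
    my $total = $t->{a} + $t->{b} + $t->{c} + $t->{d};
    return (
        ($t->{a} + $t->{d}) / $total,                     # accuracy
        $t->{a} / ($t->{a} + $t->{c}),                    # recall
        2 * $t->{a} / (2 * $t->{a} + $t->{b} + $t->{c}),  # F1
    );
}

# Hypothetical category: 100 documents, 5 of which belong to it.
my @strategies = (
    [ 'never assign',  { a => 0, b => 0,  c => 5, d => 95 } ],
    [ 'always assign', { a => 5, b => 95, c => 0, d => 0  } ],
);

for my $s (@strategies) {
    my ($name, $table) = @$s;
    printf "%-13s accuracy=%.2f recall=%.2f F1=%.2f\n",
        $name, measures($table);
}
```

       Never assigning the category scores 0.95 on accuracy while finding
       nothing, and always assigning it reaches a recall of 1 with an F1 of
       only about 0.10.  Neither strategy does well on F1.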

METHODS

       The general execution flow when using this class is to create a
       "Statistics::Contingency" object, add a bunch of results to it, and
       then report on the results.

       ·   $e = Statistics::Contingency->new()

           Returns a new "Statistics::Contingency" object.  Expects a
           "categories" parameter specifying the entire set of categories that
           may be assigned during this experiment.  Also accepts a "verbose"
           parameter - if true, some diagnostic status information will be
           displayed when certain actions are performed.

       ·   $e->add_result($assigned_categories, $correct_categories, $name)

           Adds a new result to the experiment.  The lists of assigned and
           correct categories can be given as an array of category names
           (strings), as a hash whose keys are the category names and whose
           values are anything logically true, or as a single string if there
           is only one category.

           If you've already got the lists in hash form, this will be the
           fastest way to pass them.  Otherwise, the current implementation
           will convert them to hash form internally in order to make its
           calculations efficient.

           The $name parameter is an optional name for this result.  It will
           only be used in error messages or debugging/progress output.

           In the current implementation, we only store the contingency tables
           per category, as well as a table for the entire result set.  This
           means that you can't recover information about any particular
           single result from the "Statistics::Contingency" object.

       ·   $e->set_entries($a, $b, $c, $d)

           If you don't wish to use the "add_result()" interface, but still
           want to take advantage of the calculation methods and the various
           edge cases they handle, you can directly set the four elements of
           the contingency table with this method.

       ·   $e->micro_accuracy

           Returns the micro-averaged accuracy for the data set.

       ·   $e->micro_error

           Returns the micro-averaged error for the data set.

       ·   $e->micro_precision

           Returns the micro-averaged precision for the data set.

       ·   $e->micro_recall

           Returns the micro-averaged recall for the data set.

       ·   $e->micro_F1

           Returns the micro-averaged F1 for the data set.

       ·   $e->macro_accuracy

           Returns the macro-averaged accuracy for the data set.

       ·   $e->macro_error

           Returns the macro-averaged error for the data set.

       ·   $e->macro_precision

           Returns the macro-averaged precision for the data set.

       ·   $e->macro_recall

           Returns the macro-averaged recall for the data set.

       ·   $e->macro_F1

           Returns the macro-averaged F1 for the data set.

       ·   $e->stats_table

           Returns a string combining several statistics in one graphic table.
           Because accuracy is 1 minus error, only error is reported, since it
           takes less space to print.  An optional argument specifies the
           number of significant digits to show in the data - the default is 3
           significant digits.

       ·   $e->category_stats

           Returns a hash reference whose keys are the names of each category,
           and whose values contain the various statistical measures
           (accuracy, error, precision, recall, or F1) about each category as
           a hash reference.  For example, to print a single statistic:

            print $e->category_stats->{sports}{recall}, "\n";

           Or to print certain statistics for all categories:

            my $stats = $e->category_stats;
            while (my ($cat, $value) = each %$stats) {
              print "Category '$cat': \n";
              print "  Accuracy: $value->{accuracy}\n";
              print "  Precision: $value->{precision}\n";
              print "  F1: $value->{F1}\n";
            }
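
       The description of "add_result()" above mentions that category lists
       are converted to hash form and that only per-category tables are
       kept.  A rough sketch of that idea in plain Perl follows.  The
       helpers "to_hash" and "tally" are hypothetical and are not the
       module's internals; the sketch also assumes that the array and hash
       forms are passed as references.

```perl
use strict;
use warnings;

# Illustrative only: normalize a category list given as an array ref,
# a hash ref (true values only), or a single string into a hash ref.
sub to_hash {
    my ($cats) = @_;
    return {} unless defined $cats;
    my %h;
    if (ref $cats eq 'HASH') {
        %h = map { $_ => 1 } grep { $cats->{$_} } keys %$cats;
    }
    elsif (ref $cats eq 'ARRAY') {
        %h = map { $_ => 1 } @$cats;
    }
    else {
        %h = ($cats => 1);
    }
    return \%h;
}

# Update the per-category tables (cells a, b, c, d) for one result.
sub tally {
    my ($tables, $all_categories, $assigned, $correct) = @_;
    my ($as, $co) = (to_hash($assigned), to_hash($correct));
    for my $cat (@$all_categories) {
        my $cell = $as->{$cat} ? ($co->{$cat} ? 'a' : 'b')
                               : ($co->{$cat} ? 'c' : 'd');
        $tables->{$cat}{$cell}++;
    }
}

my @categories = qw(sports politics finance);
my %tables;
tally(\%tables, \@categories, ['sports', 'finance'], 'sports');
tally(\%tables, \@categories, { politics => 1 },
      { politics => 1, finance => 1 });

for my $cat (@categories) {
    my $t = $tables{$cat};
    printf "%-8s a=%d b=%d c=%d d=%d\n",
        $cat, map { $t->{$_} // 0 } qw(a b c d);
}
```

       Because only the counts are kept, the two individual results cannot
       be reconstructed from %tables afterwards, which mirrors the
       limitation noted for "add_result()".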

AUTHOR

       Ken Williams <kwilliams@cpan.org>

COPYRIGHT

       Copyright 2002-2008 Ken Williams.  All rights reserved.

       This distribution is free software; you can redistribute it and/or
       modify it under the same terms as Perl itself.

perl v5.32.0                      2020-07-28        Statistics::Contingency(3)