Fsdb::Filter::dbcolpercentile(3)  User Contributed Perl Documentation  Fsdb::Filter::dbcolpercentile(3)

NAME

       dbcolpercentile - compute percentiles or ranks for an existing numeric
       column

SYNOPSIS

           dbcolpercentile [-rplhS] [--mode MODE] [--value WEIGHT_COL] column

DESCRIPTION

       Compute a percentile, ranking, or weighted percentile of a column of
       numbers.  The new column will be called percentile:d or rank:q or
       weighted:d depending on the mode.

       Ordering is given by the specified column.

       In weighted mode, by default the same column as ordering is used for
       weighting.  Alternatively, give a different column for weighting with
       "-v".

       Non-numeric values are ignored.

       If the data is pre-sorted and only a rank is requested, no extra
       storage is required.  In all other cases, a full copy of the data is
       buffered on disk.  Output will be sorted by COLUMN.
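       The storage claim above can be illustrated with a minimal Python
       sketch.  It is not the module's Perl implementation: the function
       name stream_ranks, the check_sorted parameter, and the assumption
       that pre-sorted input arrives in descending order (highest value
       first, as in the output under SAMPLE USAGE) are all hypothetical.
       A rank depends only on rows already seen, so it can be emitted in
       one pass; a percentile also needs the total row count, which is not
       known until the whole input has been read.

```python
from typing import Iterable, Iterator, Optional, Tuple


def stream_ranks(
    values: Iterable[float], check_sorted: bool = True
) -> Iterator[Tuple[float, int]]:
    """Yield (value, rank) pairs for input assumed sorted in descending order.

    Only the previous value, its rank, and the row position are kept,
    so memory use does not grow with the size of the input.
    """
    prev: Optional[float] = None
    rank = 0
    for position, value in enumerate(values, start=1):
        # Optional precondition check (assumption: a larger value after a
        # smaller one means the input was not sorted in descending order).
        if check_sorted and prev is not None and value > prev:
            raise ValueError(f"input is not sorted at row {position}")
        # The first row of a new value takes its position as rank;
        # tied rows keep the rank of the first row with that value.
        if prev is None or value != prev:
            rank = position
        prev = value
        yield value, rank


if __name__ == "__main__":
    for value, rank in stream_ranks([90, 90, 80, 70, 70, 65]):
        print(value, rank)
```

       With the test1 values from SAMPLE USAGE in descending order, the
       sketch assigns tied rows the same rank and skips the ranks they
       occupy, which is the pattern shown in the rank output there.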

OPTIONS

       -p or --percentile or --mode percentile
           Show percentile (default).  The percentile of a row is the number
           of rows whose value is at or lower than the current value, divided
           by the total count.

       -P or --rank or --nopercentile or --mode rank
           Compute ranks instead of percentiles.

       -w WEIGHT_COL or --weighted WEIGHT_COL or --mode weighted
           Compute the weighted percentile.  Here values define not only the
           ordering but also a share of the total sum: the weighted
           percentile is the sum of the weighting column over all rows whose
           ranking column is at or lower than that of the current row,
           divided by the sum of the weighting column over all rows.  If the
           weight column is not specified (with "--mode weighted"), it is the
           same as the ranking column.

       -a or --include-non-numeric
           Compute stats over all records (treat non-numeric records as zero
           rather than just ignoring them).

       -S or --pre-sorted
           Assume data is already sorted.  With one -S, we check and confirm
           this precondition.  When repeated, we skip the check.

       -N NAME or --new-name NAME
           Give the NAME of the new column.  (If no type is specified, a type
           will be assigned based on the mode.)

       -f FORMAT or --format FORMAT
           Specify a printf(3)-style format for output statistics.  Defaults
           to "%.5g".

       -T TmpDir
           Where to put temporary files.  If -T is not specified, the
           environment variable TMPDIR is used, if set.  Default is /tmp.

       -e EmptyValue or --empty
           Specify the value any non-numeric rows get, if in weighted mode.

       This module also supports the standard fsdb options:

       -d  Enable debugging output.

       -i or --input InputSource
           Read from InputSource, typically a file name, or "-" for standard
           input, or (if in Perl) an IO::Handle, Fsdb::IO or
           Fsdb::BoundedQueue object.

       -o or --output OutputDestination
           Write to OutputDestination, typically a file name, or "-" for
           standard output, or (if in Perl) an IO::Handle, Fsdb::IO or
           Fsdb::BoundedQueue object.

       --autorun or --noautorun
           By default, programs process automatically, but Fsdb::Filter
           objects in Perl do not run until you invoke the run() method.  The
           "--(no)autorun" option controls that behavior within Perl.

       --help
           Show help.

       --man
           Show full manual.
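       The weighted mode described in the option list above can be sketched
       in Python as follows.  This is an illustration under stated
       assumptions, not the module's Perl code: the function name
       weighted_percentiles is hypothetical, rows with equal ranking values
       are assumed to share one result, and the output order (highest
       ranking value first) is assumed to match the unweighted sample
       output.

```python
from typing import List, Sequence, Tuple


def weighted_percentiles(
    rows: Sequence[Tuple[float, float]],
) -> List[Tuple[float, float, float]]:
    """Return (rank_value, weight, weighted_percentile), highest rank_value first.

    Each input row is a (rank_value, weight) pair.  The weighted percentile
    of a row is the sum of weights over rows whose rank_value is at or below
    the row's rank_value, divided by the sum of all weights.
    """
    total = sum(weight for _, weight in rows)
    if total == 0:
        raise ValueError("total weight is zero; fraction is undefined")
    result = []
    for value, weight in sorted(rows, key=lambda row: row[0], reverse=True):
        # Quadratic scan kept for clarity; a running sum over sorted rows
        # would give the same numbers in one pass.
        at_or_below = sum(w for v, w in rows if v <= value)
        result.append((value, weight, at_or_below / total))
    return result


if __name__ == "__main__":
    # Ranking column and a separate weighting column (made-up numbers).
    for row in weighted_percentiles([(1, 10), (2, 30), (3, 60)]):
        print(row)
```

       To model the default case, where the weighting column is the
       ranking column itself, pass each value twice, for example (80, 80).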

SAMPLE USAGE

   Input:
           #fsdb name id test1
           a 1 80
           b 2 70
           c 3 65
           d 4 90
           e 5 70
           f 6 90

   Command:
           cat DATA/grades.fsdb | dbcolpercentile test1

   Output:
               #fsdb name id test1 percentile
               d       4       90      1
               f       6       90      1
               a       1       80      0.66667
               b       2       70      0.5
               e       5       70      0.5
               c       3       65      0.16667
               #  | dbsort -n test1
               #   | dbcolpercentile test1

   Command 2:
           cat DATA/grades.fsdb | dbcolpercentile --rank test1

   Output 2:
               #fsdb name id test1 rank
               d       4       90      1
               f       6       90      1
               a       1       80      3
               b       2       70      4
               e       5       70      4
               c       3       65      6
               #  | dbsort -n test1
               #   | dbcolpercentile --rank test1
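       The percentile column in the first output can be reproduced with a
       short Python sketch.  It is an illustration only, not the module's
       Perl implementation; the function name percentiles is hypothetical.
       Each value's percentile is the count of values at or below it
       divided by the total count, formatted with the documented default
       "%.5g".

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def percentiles(values: Sequence[float], fmt: str = "%.5g") -> List[Tuple[float, str]]:
    """Return (value, formatted_percentile) pairs, highest value first."""
    ordered = sorted(values)
    total = len(ordered)
    result = []
    for value in reversed(ordered):
        # bisect_right on the ascending list gives the number of
        # entries that are at or below value.
        at_or_below = bisect_right(ordered, value)
        result.append((value, fmt % (at_or_below / total)))
    return result


if __name__ == "__main__":
    test1 = [80, 70, 65, 90, 70, 90]  # the test1 column from the sample input
    for value, pct in percentiles(test1):
        print(value, pct)
```

       For the value 80, four of the six rows are at or below it, so the
       result is 4/6, which "%.5g" prints as 0.66667.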

SEE ALSO

       Fsdb.  dbcolhisto.

CLASS FUNCTIONS

   new
           $filter = new Fsdb::Filter::dbcolpercentile(@arguments);

       Create a new dbcolpercentile object, taking command-line arguments.

   set_defaults
           $filter->set_defaults();

       Internal: set up defaults.

   parse_options
           $filter->parse_options(@ARGV);

       Internal: parse command-line arguments.

   setup
           $filter->setup();

       Internal: set up, parse headers.

   _determine_total
           $n = $self->_determine_total()

       Interpose a filter on "$self->{_in}" that counts the rows (for rank or
       percentile) or sums the value (for weighted percentile).

   run
           $filter->run();

       Internal: run over each row.

       Copyright (C) 1991-2022 by John Heidemann <johnh@isi.edu>

       This program is distributed under terms of the GNU general public
       license, version 2.  See the file COPYING with the distribution for
       details.

perl v5.38.0                      2023-07-20  Fsdb::Filter::dbcolpercentile(3)