Fsdb::Filter::dbcolpercentile(3)   User Contributed Perl Documentation   Fsdb::Filter::dbcolpercentile(3)

NAME
       dbcolpercentile - compute percentiles or ranks for an existing numeric
       column

SYNOPSIS
       dbcolpercentile [-rplhS] [--mode MODE] [--value WEIGHT_COL] column

DESCRIPTION
       Compute a percentile, ranking, or weighted percentile of a column of
       numbers.  The new column will be called percentile or rank or weighted
       depending on the mode.

       Ordering is given by the specified column.

       In weighted mode, by default the same column as ordering is used for
       weighting.  Alternatively, give a different column for weighting with
       "-v".

       Non-numeric values are ignored by default; see -a to include them.

       If the data is pre-sorted and only a rank is requested, no extra
       storage is required.  In all other cases, a full copy of the data is
       buffered on disk.
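
       The constant-storage case for pre-sorted rank input can be
       illustrated with a minimal Python sketch.  It is not the Fsdb
       implementation.  It assumes rows arrive in descending order (as in
       the sample output), that tied values share the rank of the first row
       in the tie, and that a sort violation is reported as an error; the
       name stream_ranks is hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def stream_ranks(values: Iterable[float]) -> Iterator[Tuple[float, int]]:
    """Yield (value, rank) for values assumed sorted in descending order.

    Only the previous value, the current rank, and a row counter are kept,
    so memory use does not grow with the size of the input.
    """
    position = 0      # 1-based position of the current row
    previous = None   # value seen on the previous row
    rank = 0          # rank shared by the current group of tied values
    for value in values:
        position += 1
        if previous is not None and value > previous:
            # Assumption: a precondition check like this mirrors a single -S.
            raise ValueError("input is not sorted in descending order")
        if value != previous:
            rank = position  # a new value starts a new tie group
        previous = value
        yield value, rank


print(list(stream_ranks([90, 90, 80, 70, 70, 65])))
# [(90, 1), (90, 1), (80, 3), (70, 4), (70, 4), (65, 6)]
```

       A percentile cannot be streamed the same way, because the total row
       count is unknown until the whole input has been read.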

OPTIONS
       -p or --percentile or --mode percentile
           Show percentile (default).  Percentile is the fraction of rows
           whose value is at or lower than the current value, relative to
           the total count.

       -P or --rank or --nopercentile or --mode rank
           Compute ranks instead of percentiles.

       -w WEIGHT_COL or --weighted WEIGHT_COL or --mode weighted
           Compute the weighted percentile.  Here values define not only the
           ordering, but the fraction of the total sum, and percentile is the
           fraction of the sum of cumulative values in the weighting column
           (relative to their sum), for all rows whose ranking column is at
           or lower than that of the current row.  If the weight column is
           not specified (with "--mode weighted"), it is the same as the
           ranking column.

       -a or --include-non-numeric
           Compute stats over all records (treat non-numeric records as zero
           rather than just ignoring them).

       -S or --pre-sorted
           Assume data is already sorted.  With one -S, we check and confirm
           this precondition.  When repeated, we skip the check.

       -N NAME or --new-name NAME
           Give the NAME of the new column.

       -f FORMAT or --format FORMAT
           Specify a printf(3)-style format for output statistics.  Defaults
           to "%.5g".

       -T TmpDir
           Where to put temporary files.  Also uses environment variable
           TMPDIR, if -T is not specified.  Default is /tmp.

       -e EmptyValue or --empty
           Specify the value that non-numeric rows get in weighted mode.
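
       The weighted-percentile definition above can be sketched in Python.
       This is an illustration of the arithmetic only, not Fsdb code: the
       function name weighted_percentiles and the sample numbers are
       hypothetical, and the quadratic loop is chosen for clarity rather
       than speed.  Output is formatted with the default "%.5g".

```python
from typing import List, Optional, Sequence


def weighted_percentiles(order: Sequence[float],
                         weights: Optional[Sequence[float]] = None) -> List[float]:
    """Return, per row, the share of total weight held by rows at or below it."""
    if weights is None:
        # With no separate weight column, the ordering column doubles as weight.
        weights = order
    total = sum(weights)
    result = []
    for current in order:
        # Sum the weights of every row whose ordering value is <= this row's.
        cumulative = sum(w for o, w in zip(order, weights) if o <= current)
        result.append(cumulative / total)
    return result


# Ordering column doubles as the weight column.
print(["%.5g" % p for p in weighted_percentiles([1, 2, 3, 4])])
# ['0.1', '0.3', '0.6', '1']

# Separate (hypothetical) weight column.
print(["%.5g" % p for p in weighted_percentiles([10, 20, 30], [1, 1, 2])])
# ['0.25', '0.5', '1']
```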

       This module also supports the standard fsdb options:

       -d  Enable debugging output.

       -i or --input InputSource
           Read from InputSource, typically a file name, or "-" for standard
           input, or (if in Perl) an IO::Handle, Fsdb::IO, or
           Fsdb::BoundedQueue object.

       -o or --output OutputDestination
           Write to OutputDestination, typically a file name, or "-" for
           standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or
           Fsdb::BoundedQueue object.

       --autorun or --noautorun
           By default, programs process automatically, but Fsdb::Filter
           objects in Perl do not run until you invoke the run() method.  The
           "--(no)autorun" option controls that behavior within Perl.

       --help
           Show help.

       --man
           Show full manual.
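
       As a rough sketch of how the -i and -o options combine with a mode
       flag, the following Python builds a command line and runs it only
       if dbcolpercentile is found on the PATH.  The helper names
       build_command and run_if_installed are hypothetical; only the
       flags documented above are used.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(column: str, rank: bool = False,
                  input_path: str = "-", output_path: str = "-") -> List[str]:
    """Assemble an argument list; "-" means standard input or output."""
    cmd = ["dbcolpercentile"]
    if rank:
        cmd.append("--rank")
    cmd += ["-i", input_path, "-o", output_path, column]
    return cmd


def run_if_installed(cmd: List[str]) -> Optional[int]:
    """Run the command and return its exit status, or None if it is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


print(build_command("test1", rank=True, input_path="DATA/grades.fsdb"))
# ['dbcolpercentile', '--rank', '-i', 'DATA/grades.fsdb', '-o', '-', 'test1']
```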

SAMPLE USAGE
   Input:
           #fsdb name id test1
           a 1 80
           b 2 70
           c 3 65
           d 4 90
           e 5 70
           f 6 90

   Command:
           cat DATA/grades.fsdb | dbcolpercentile test1

   Output:
           #fsdb name id test1 percentile
           d 4 90 1
           f 6 90 1
           a 1 80 0.66667
           b 2 70 0.5
           e 5 70 0.5
           c 3 65 0.16667
           # | dbsort -n test1
           # | dbcolpercentile test1
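
       The percentile values in this output can be reproduced with a short
       Python sketch.  It is an independent illustration, not the Fsdb
       implementation, and the function name percentiles is hypothetical.
       Each row's percentile is the count of rows at or below its value,
       divided by the total count, printed with the default "%.5g" format.

```python
from typing import List, Sequence


def percentiles(values: Sequence[float]) -> List[float]:
    """Fraction of rows whose value is at or below each row's value."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


# Rows from the sample input, in input order.
grades = [("a", 80), ("b", 70), ("c", 65), ("d", 90), ("e", 70), ("f", 90)]
scores = [score for _, score in grades]

for (name, score), p in zip(grades, percentiles(scores)):
    print(name, score, "%.5g" % p)
# a 80 0.66667
# b 70 0.5
# c 65 0.16667
# d 90 1
# e 70 0.5
# f 90 1
```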

   Command 2:
           cat DATA/grades.fsdb | dbcolpercentile --rank test1

   Output 2:
           #fsdb name id test1 rank
           d 4 90 1
           f 6 90 1
           a 1 80 3
           b 2 70 4
           e 5 70 4
           c 3 65 6
           # | dbsort -n test1
           # | dbcolpercentile --rank test1
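
       The ranks in this output follow a pattern in which tied values share
       a rank and the next distinct value skips ahead.  A minimal Python
       sketch that reproduces them, under the assumption that a row's rank
       is one plus the number of rows with a strictly greater value (the
       function name ranks is hypothetical):

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank 1 is the largest value; ties share the lowest rank in their group."""
    return [1 + sum(1 for other in values if other > v) for v in values]


# test1 column from the sample input, in input order (rows a through f).
print(ranks([80, 70, 65, 90, 70, 90]))
# [3, 4, 6, 1, 4, 1]
```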

SEE ALSO
       Fsdb.  dbcolhisto.

CLASS FUNCTIONS
   new
           $filter = new Fsdb::Filter::dbcolpercentile(@arguments);

       Create a new dbcolpercentile object, taking command-line arguments.

   set_defaults
           $filter->set_defaults();

       Internal: set up defaults.

   parse_options
           $filter->parse_options(@ARGV);

       Internal: parse command-line arguments.

   setup
           $filter->setup();

       Internal: setup, parse headers.

   _determine_total
           $n = $self->_determine_total()

       Interpose a filter on "$self->{_in}" that counts the rows (for rank or
       percentile) or sums the value (for weighted percentile).

   run
           $filter->run();

       Internal: run over each row.

AUTHOR and COPYRIGHT
       Copyright (C) 1991-2022 by John Heidemann <johnh@isi.edu>

       This program is distributed under terms of the GNU General Public
       License, version 2.  See the file COPYING with the distribution for
       details.

perl v5.34.1                      2022-04-04      Fsdb::Filter::dbcolpercentile(3)