DBCOLPERCENTILE(1) User Contributed Perl Documentation DBCOLPERCENTILE(1)

NAME
dbcolpercentile - compute percentiles or ranks for an existing numeric
column

SYNOPSIS
dbcolpercentile [-rplhS] [--mode MODE] [--value WEIGHT_COL] column
DESCRIPTION
Compute a percentile, ranking, or weighted percentile of a column of
numbers. The new column will be called percentile:d, rank:q, or
weighted:d, depending on the mode.

Ordering is given by the specified column.

In weighted mode, the ordering column is also used for weighting by
default. Alternatively, give a different column for weighting with
"-v".

Non-numeric values are ignored.

If the data is pre-sorted and only a rank is requested, no extra
storage is required. In all other cases, a full copy of the data is
buffered on disk. Output will be sorted by COLUMN.
OPTIONS
-p or --percentile or --mode percentile
    Show percentile (default). The percentile of a record is the
    fraction of records whose value is at or below that record's
    value, relative to the total count.

-P or --rank or --nopercentile or --mode rank
    Compute ranks instead of percentiles.

-w WEIGHT_COL or --weighted WEIGHT_COL or --mode weighted
    Compute the weighted percentile. Here values define not only the
    ordering but also each record's share of the total sum: the
    percentile is the sum of the weighting column over all records
    whose ranking column is at or below the current record's, divided
    by the sum of the whole weighting column. If the weight column is
    not specified (with "--mode weighted"), it is the same as the
    ranking column.
-a or --include-non-numeric
    Compute stats over all records (treat non-numeric records as zero
    rather than just ignoring them).

-S or --pre-sorted
    Assume data is already sorted. With one -S, we check and confirm
    this precondition. When repeated, we skip the check.

-N NAME or --new-name NAME
    Give the NAME of the new column. (If no type is specified, a type
    will be assigned based on the mode.)

-f FORMAT or --format FORMAT
    Specify a printf(3)-style format for output statistics. Defaults
    to "%.5g".

-T TmpDir
    Where to put temporary files. If -T is not specified, the
    environment variable TMPDIR is used. Default is /tmp.

-e EmptyValue or --empty
    Specify the value any non-numeric rows get, if in weighted mode.
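
The weighted percentile described under -w can be sketched in a few lines of Python. This is an illustration of the definition only, not the tool's implementation: the function name weighted_percentiles and its list-based arguments are hypothetical, non-numeric handling (-a, -e) is left out, and ties are assumed to share the cumulative weight of every record at or below their value.

```python
def weighted_percentiles(values, weights=None):
    """Return one weighted percentile per record, in input order.

    values  -- the ordering (ranking) column
    weights -- the weighting column; assumed to default to `values`,
               mirroring "--mode weighted" with no weight column given
    """
    if weights is None:
        weights = values
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero; percentile is undefined")
    result = []
    for v in values:
        # Sum the weights of every record ordered at or below this one.
        at_or_below = sum(w for u, w in zip(values, weights) if u <= v)
        result.append(at_or_below / total)
    return result


print(weighted_percentiles([10, 20, 30, 40]))                 # self-weighted
print(weighted_percentiles([10, 20, 30, 40], [1, 1, 1, 1]))   # equal weights
```

With equal weights the result reduces to the plain percentile, which is a quick way to sanity-check the definition. The quadratic loop is chosen for readability; a real implementation would sort once and keep a running sum.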
This module also supports the standard fsdb options:

-d  Enable debugging output.

-i or --input InputSource
    Read from InputSource, typically a file name, or "-" for standard
    input, or (if in Perl) an IO::Handle, Fsdb::IO, or
    Fsdb::BoundedQueue object.

-o or --output OutputDestination
    Write to OutputDestination, typically a file name, or "-" for
    standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or
    Fsdb::BoundedQueue object.

--autorun or --noautorun
    By default, programs process automatically, but Fsdb::Filter
    objects in Perl do not run until you invoke the run() method. The
    "--(no)autorun" option controls that behavior within Perl.

--help
    Show help.

--man
    Show full manual.
SAMPLE USAGE
Input:
    #fsdb name id test1
    a 1 80
    b 2 70
    c 3 65
    d 4 90
    e 5 70
    f 6 90

Command:
    cat DATA/grades.fsdb | dbcolpercentile test1

Output:
    #fsdb name id test1 percentile
    d 4 90 1
    f 6 90 1
    a 1 80 0.66667
    b 2 70 0.5
    e 5 70 0.5
    c 3 65 0.16667
    # | dbsort -n test1
    # | dbcolpercentile test1
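
The percentile column above can be reproduced by hand. The following Python sketch assumes the definition given for -p (records at or below the current value, divided by the total count) and the default "%.5g" format; the function name percentiles and the scores dictionary are illustrative, not part of Fsdb.

```python
def percentiles(values):
    """Fraction of records at or below each value, relative to the total count."""
    n = len(values)
    return [sum(1 for u in values if u <= v) / n for v in values]


# The test1 column from the sample input, keyed by the name column.
scores = {"a": 80, "b": 70, "c": 65, "d": 90, "e": 70, "f": 90}
pct = dict(zip(scores, percentiles(list(scores.values()))))

# Print highest score first, in the same order as the sample output.
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(name, score, "%.5g" % pct[name])
```

For example, four of the six records score 80 or less, so record a gets 4/6, printed as 0.66667.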
Command 2:
    cat DATA/grades.fsdb | dbcolpercentile --rank test1

Output 2:
    #fsdb name id test1 rank
    d 4 90 1
    f 6 90 1
    a 1 80 3
    b 2 70 4
    e 5 70 4
    c 3 65 6
    # | dbsort -n test1
    # | dbcolpercentile --rank test1
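
The rank column above also shows why pre-sorted input needs no extra storage in rank mode: each rank depends only on the records already seen. The Python sketch below is a guess at the rule inferred from this sample (highest value ranks first, tied values share the position of the first of them), not a description of the tool's code; ranks_presorted is a hypothetical name.

```python
def ranks_presorted(sorted_values):
    """Yield (value, rank) in one pass over values sorted from high to low."""
    rank = 0
    previous = None
    for position, value in enumerate(sorted_values, start=1):
        if value != previous:
            # The first record with a new value takes its own position;
            # later ties reuse that rank.
            rank = position
            previous = value
        yield value, rank


for value, rank in ranks_presorted([90, 90, 80, 70, 70, 65]):
    print(value, rank)
```

A percentile, by contrast, divides by the total count, which is not known until all input has been read.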
SEE ALSO
Fsdb. dbcolhisto.

AUTHOR and COPYRIGHT
Copyright (C) 1991-2022 by John Heidemann <johnh@isi.edu>

This program is distributed under terms of the GNU General Public
License, version 2. See the file COPYING with the distribution for
details.

perl v5.38.0 2023-07-20 DBCOLPERCENTILE(1)