DBCOLPERCENTILE(1)    User Contributed Perl Documentation   DBCOLPERCENTILE(1)

NAME

       dbcolpercentile - compute percentiles or ranks for an existing numeric
       column

SYNOPSIS

           dbcolpercentile [-rplhS] [--mode MODE] [--value WEIGHT_COL] column

DESCRIPTION

       Compute a percentile, ranking, or weighted percentile of a column of
       numbers.  The new column will be called percentile:d or rank:q or
       weighted:d depending on the mode.

       Ordering is given by the specified column.

       In weighted mode, by default the same column as ordering is used for
       weighting.  Alternatively, give a different column for weighting with
       "-v".

       Non-numeric values are ignored.

       If the data is pre-sorted and only a rank is requested, no extra
       storage is required.  In all other cases, a full copy of data is
       buffered on disk.  Output will be sorted by COLUMN.
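
       The reason a rank over pre-sorted data needs no extra storage can be
       sketched in a few lines of Python.  This is an illustration only, not
       the program's implementation: the function name streaming_ranks is
       hypothetical, and it assumes rows arrive largest first with tied
       values sharing the rank of the first tied row, as in the sample
       output.  A percentile, by contrast, needs the total count, which is
       not known until the last row has been read.

```python
from typing import Iterable, Iterator, Optional, Tuple


def streaming_ranks(sorted_values: Iterable[float]) -> Iterator[Tuple[float, int]]:
    """Yield (value, rank) pairs in one pass over descending-sorted input.

    Assumption (not taken from the tool's source): tied values share the
    rank of the first row in the tie, giving 1, 1, 3, 4, 4, 6.
    Only the previous value and the current rank are kept in memory.
    """
    previous: Optional[float] = None
    rank = 0
    for position, value in enumerate(sorted_values, start=1):
        if previous is not None and value > previous:
            # Comparable in spirit to the check done with a single -S.
            raise ValueError("input is not sorted in descending order")
        if value != previous:
            rank = position  # a new value starts a new rank
        previous = value
        yield value, rank


print(list(streaming_ranks([90, 90, 80, 70, 70, 65])))
```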

OPTIONS

       -p or --percentile or --mode percentile
           Show percentile (default).  The percentile is the number of
           values at or lower than the current value, divided by the total
           count.

       -P or --rank or --nopercentile or --mode rank
           Compute ranks instead of percentiles.

       -w WEIGHT_COL or --weighted WEIGHT_COL or --mode weighted
           Compute the weighted percentile.  Here values define not only the
           ordering but also each row's share of the total: the weighted
           percentile is the sum of the weighting column over all rows whose
           ranking value is at or lower than the current row's, divided by
           the sum of the weighting column over all rows.  If the weight
           column is not specified (with "--mode weighted"), it is the same
           as the ranking column.

       -a or --include-non-numeric
           Compute stats over all records (treat non-numeric records as zero
           rather than just ignoring them).

       -S or --pre-sorted
           Assume data is already sorted.  With one -S, we check and confirm
           this precondition.  When repeated, we skip the check.

       -N NAME or --new-name NAME
           Give the NAME of the new column.  (If no type is specified, a
           type will be assigned based on the mode.)

       -f FORMAT or --format FORMAT
           Specify a printf(3)-style format for output statistics.  Defaults
           to "%.5g".

       -T TmpDir
           Where to put temporary files.  If -T is not specified, the
           environment variable TMPDIR is used.  Default is /tmp.

       -e EmptyValue or --empty
           In weighted mode, specify the value assigned to any non-numeric
           rows.

       This module also supports the standard fsdb options:

       -d  Enable debugging output.

       -i or --input InputSource
           Read from InputSource, typically a file name, or "-" for standard
           input, or (if in Perl) an IO::Handle, Fsdb::IO or
           Fsdb::BoundedQueue object.

       -o or --output OutputDestination
           Write to OutputDestination, typically a file name, or "-" for
           standard output, or (if in Perl) an IO::Handle, Fsdb::IO or
           Fsdb::BoundedQueue object.

       --autorun or --noautorun
           By default, programs process automatically, but Fsdb::Filter
           objects in Perl do not run until you invoke the run() method.  The
           "--(no)autorun" option controls that behavior within Perl.

       --help
           Show help.

       --man
           Show full manual.
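
       As a rough illustration of the weighted mode described above, the
       following Python sketch computes a weighted percentile for each row.
       It is a reading of the definition in this page, not the program's
       code: the function name weighted_percentiles and its arguments are
       hypothetical, and the quadratic loop is chosen for clarity rather
       than speed.

```python
from typing import List, Optional, Sequence


def weighted_percentiles(
    ranking: Sequence[float], weights: Optional[Sequence[float]] = None
) -> List[float]:
    """Return one weighted percentile per row.

    For each row, sum the weights of all rows whose ranking value is at
    or lower than this row's, then divide by the sum of all weights.
    If no weights are given, the ranking values serve as the weights.
    """
    if weights is None:
        weights = ranking
    if len(weights) != len(ranking):
        raise ValueError("ranking and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    result = []
    for current in ranking:
        at_or_below = sum(w for r, w in zip(ranking, weights) if r <= current)
        result.append(at_or_below / total)
    return result


scores = [90, 90, 80, 70, 70, 65]
print(weighted_percentiles(scores))             # scores weight themselves
print(weighted_percentiles(scores, [1] * 6))    # equal weights
```

       With equal weights the result reduces to the plain percentile, which
       is a quick way to check the definition.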

SAMPLE USAGE

   Input:
           #fsdb name id test1
           a 1 80
           b 2 70
           c 3 65
           d 4 90
           e 5 70
           f 6 90

   Command:
           cat DATA/grades.fsdb | dbcolpercentile test1

   Output:
               #fsdb name id test1 percentile
               d       4       90      1
               f       6       90      1
               a       1       80      0.66667
               b       2       70      0.5
               e       5       70      0.5
               c       3       65      0.16667
               #  | dbsort -n test1
               #   | dbcolpercentile test1

   Command 2:
           cat DATA/grades.fsdb | dbcolpercentile --rank test1

   Output 2:
               #fsdb name id test1 rank
               d       4       90      1
               f       6       90      1
               a       1       80      3
               b       2       70      4
               e       5       70      4
               c       3       65      6
               #  | dbsort -n test1
               #   | dbcolpercentile --rank test1
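
       The numbers in both outputs can be checked by hand with a short
       Python sketch.  It is not how dbcolpercentile is implemented; the
       function names are hypothetical, and the rank rule (one plus the
       number of strictly larger values) is inferred from Output 2 rather
       than stated by the program.  Values are formatted with the default
       "%.5g".

```python
from typing import List, Sequence


def percentiles(values: Sequence[float]) -> List[float]:
    """Fraction of values at or lower than each value."""
    total = len(values)
    return [sum(1 for other in values if other <= v) / total for v in values]


def ranks(values: Sequence[float]) -> List[int]:
    """Rank 1 is the largest value; ties share a rank (assumed convention)."""
    return [1 + sum(1 for other in values if other > v) for v in values]


# The test1 column in the order shown in the sample output.
scores = [90, 90, 80, 70, 70, 65]
for score, pct, rank in zip(scores, percentiles(scores), ranks(scores)):
    print(score, "%.5g" % pct, rank)
```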

SEE ALSO

       Fsdb.  dbcolhisto.

AUTHOR and COPYRIGHT

       Copyright (C) 1991-2022 by John Heidemann <johnh@isi.edu>

       This program is distributed under terms of the GNU general public
       license, version 2.  See the file COPYING with the distribution for
       details.

perl v5.38.0                      2023-07-20                DBCOLPERCENTILE(1)