DBCOLPERCENTILE(1)    User Contributed Perl Documentation   DBCOLPERCENTILE(1)

NAME

       dbcolpercentile - compute percentiles or ranks for an existing numeric
       column

SYNOPSIS

           dbcolpercentile [-rplhS] [--mode MODE] [--value WEIGHT_COL] column

DESCRIPTION

       Compute a percentile, ranking, or weighted percentile of a column of
       numbers.  The new column will be called percentile or rank or weighted
       depending on the mode.

       Ordering is given by the specified column.

       In weighted mode, by default the same column as ordering is used for
       weighting.  Alternatively, give a different column for weighting with
       "-v".

       Non-numeric values are ignored unless -a is given.

       If the data is pre-sorted and only a rank is requested, no extra
       storage is required.  In all other cases, a full copy of the data is
       buffered on disk.

OPTIONS

       -p or --percentile or --mode percentile
           Show percentile (default).  The percentile is the number of
           values at or lower than the current value, divided by the total
           count.

       -P or --rank or --nopercentile or --mode rank
           Compute ranks instead of percentiles.

       -w WEIGHT_COL or --weighted WEIGHT_COL or --mode weighted
           Compute the weighted percentile.  Here values define not only the
           ordering but also each row's share of the total sum: the
           percentile is the sum of the weighting column over all rows whose
           ranking column is at or lower than the current row's, divided by
           the sum of the whole weighting column.  If the weight column is
           not specified (with "--mode weighted"), it is the same as the
           ranking column.

       -a or --include-non-numeric
           Compute stats over all records (treat non-numeric records as zero
           rather than just ignoring them).

       -S or --pre-sorted
           Assume data is already sorted.  With one -S, we check and confirm
           this precondition.  When repeated, we skip the check.

       -N NAME or --new-name NAME
           Give the NAME of the new column.

       -f FORMAT or --format FORMAT
           Specify a printf(3)-style format for output statistics.  Defaults
           to "%.5g".

       -T TmpDir
           Where to put temporary files.  If -T is not specified, the
           environment variable TMPDIR is used.  Default is /tmp.

       -e EmptyValue or --empty
           Specify the value given to non-numeric rows in weighted mode.

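       The weighted percentile described under -w can be illustrated with a
       short Python sketch.  It follows only the definition given above and
       is not taken from the Fsdb source; the function name
       weighted_percentiles and the (rank_value, weight) row layout are
       hypothetical, and how the real tool treats ties or non-numeric rows
       is not modeled here.

```python
def weighted_percentiles(rows):
    """Return one weighted percentile per row, in input order.

    rows is a list of (rank_value, weight) pairs (a hypothetical layout).
    Each result is the sum of weights over all rows whose rank_value is at
    or below the current row's, divided by the sum of all weights.
    """
    total = sum(weight for _, weight in rows)
    if total == 0:
        raise ValueError("weights sum to zero; percentile is undefined")
    return [
        sum(weight for other, weight in rows if other <= rank_value) / total
        for rank_value, _ in rows
    ]


# Default weighted mode: the ranking column doubles as the weight column.
test1 = [80, 70, 65, 90, 70, 90]
for value, pct in zip(test1, weighted_percentiles([(v, v) for v in test1])):
    print(value, "%.5g" % pct)
```

       The quadratic scan keeps the sketch close to the wording of the
       definition; sorting once and keeping a running sum would give the
       same fractions with less work.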
       This module also supports the standard fsdb options:

       -d  Enable debugging output.

       -i or --input InputSource
           Read from InputSource, typically a file name, or "-" for standard
           input, or (if in Perl) an IO::Handle, Fsdb::IO or
           Fsdb::BoundedQueue object.

       -o or --output OutputDestination
           Write to OutputDestination, typically a file name, or "-" for
           standard output, or (if in Perl) an IO::Handle, Fsdb::IO or
           Fsdb::BoundedQueue object.

       --autorun or --noautorun
           By default, programs process automatically, but Fsdb::Filter
           objects in Perl do not run until you invoke the run() method.  The
           "--(no)autorun" option controls that behavior within Perl.

       --help
           Show help.

       --man
           Show full manual.

SAMPLE USAGE

   Input:
           #fsdb name id test1
           a 1 80
           b 2 70
           c 3 65
           d 4 90
           e 5 70
           f 6 90

   Command:
           cat DATA/grades.fsdb | dbcolpercentile test1

   Output:
               #fsdb name id test1 percentile
               d       4       90      1
               f       6       90      1
               a       1       80      0.66667
               b       2       70      0.5
               e       5       70      0.5
               c       3       65      0.16667
               #  | dbsort -n test1
               #   | dbcolpercentile test1

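       The percentile column in the output above can be reproduced by hand
       with a small Python sketch.  It assumes the definition from OPTIONS
       (count of values at or lower than the current value, divided by the
       total count) and the default "%.5g" format; the function name
       percentiles is hypothetical and the code is not the tool's
       implementation.

```python
def percentiles(values, fmt="%.5g"):
    """Format, for each value, the fraction of values at or below it."""
    n = len(values)
    return [
        fmt % (sum(1 for other in values if other <= value) / n)
        for value in values
    ]


# The test1 column from the sample input, in input order (rows a to f).
test1 = [80, 70, 65, 90, 70, 90]
print(percentiles(test1))
# ['0.66667', '0.5', '0.16667', '1', '0.5', '1']
```

       The results are listed in input order rather than the sorted order
       of the sample output, but each row carries the same value: for
       example, four of the six scores are at or below 80, giving 0.66667.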
   Command 2:
           cat DATA/grades.fsdb | dbcolpercentile --rank test1

   Output 2:
               #fsdb name id test1 rank
               d       4       90      1
               f       6       90      1
               a       1       80      3
               b       2       70      4
               e       5       70      4
               c       3       65      6
               #  | dbsort -n test1
               #   | dbcolpercentile --rank test1
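       The ranks above follow the pattern 1, 1, 3, 4, 4, 6: tied values
       share the rank of their first occurrence.  The Python sketch below
       shows why ranking pre-sorted data needs no extra storage, since one
       pass only has to remember the previous value and the current rank.
       It assumes input ordered from highest to lowest, as in the sample
       output; the function name ranks_presorted is hypothetical, and the
       sort check is only loosely modeled on the single -S behavior.

```python
def ranks_presorted(sorted_desc):
    """Yield (value, rank) in one pass over values sorted high to low.

    Only the previous value and the current rank are kept, so memory use
    does not grow with the input.  Ties share the rank of their first
    occurrence.
    """
    prev = None
    rank = 0
    for position, value in enumerate(sorted_desc, start=1):
        if prev is not None and value > prev:
            # The precondition is violated: input is not in descending order.
            raise ValueError("input is not sorted from highest to lowest")
        if value != prev:
            rank = position
        prev = value
        yield value, rank


test1 = [80, 70, 65, 90, 70, 90]
print(list(ranks_presorted(sorted(test1, reverse=True))))
# [(90, 1), (90, 1), (80, 3), (70, 4), (70, 4), (65, 6)]
```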

SEE ALSO

       Fsdb.  dbcolhisto.

       Copyright (C) 1991-2022 by John Heidemann <johnh@isi.edu>

       This program is distributed under terms of the GNU general public
       license, version 2.  See the file COPYING with the distribution for
       details.

perl v5.34.1                      2022-04-04                DBCOLPERCENTILE(1)