Fsdb::Filter::dbfilepivot(3pm)

1Fsdb::Filter::dbfilepivUoste(r3)Contributed Perl DocumenFtsadtbi:o:nFilter::dbfilepivot(3)
2
3
4

NAME

6       dbfilepivot - pivot a table, converting multiple rows into single wide
7       row
8

SYNOPSIS

10       dbfilepivot [-e empty] -k KeyField -p PivotField [-v ValueField]
11

DESCRIPTION

13       Pivot a table, converting multiple rows corresponding to the same key
14       into a single wide row.
15
16       In a normalized database, one might have data with a schema like (id,
17       attribute, value), but sometimes it's more convenient to see the data
18       with a schema like (id, attribute1, attribute2).  (For example,
19       gnuplot's stacked histograms requires denormalized data.)  Dbfilepivot
20       converts the normalized format to the denormalized, but sometimes
21       useful, format.  Here the "id" is the key, the attribute is the
22       "pivot", and the value is, well, the optional "value".
23
24       An example is clearer.  A gradebook usually looks like:
25
26           #fsdb name hw_1 hw_2 hw_3
27           John       97  98  99
28           Paul       -   80  82
29
30       but a properly normalized format would represent it as:
31
32           #fsdb name hw score
33           John       1  97
34           John       2  98
35           John       3  99
36           Paul       2  80
37           Paul       3  82
38
39       This tool converts the second form into the first, when used as
40
41           dbfilepivot -k name -p hw -v score
42
43       or
44
45           dbfilepivot --possible-pivots='1 2 3' -k name -p hw -v score
46
47       Here name is the key column that indicates which rows belong to the
48       same entity, hw is the pivot column that will be indicate which column
49       in the output is relevant, and score is the value that indicates what
50       goes in the output.
51
52       The pivot creates a new column "key_tag1", "key_tag2", etc.  for each
53       tag, the contents of the pivot field in the input.  It then populates
54       those new columns with the contents of the value field in the input.
55
56       If no value column is specified, then values are either empty or 1.
57
58       Dbfilepivot assumes all lines with the same key are adjacent in the
59       input source, like dbmapreduce(1) with the -S option.  To enforce this
60       invariant, by default, it requires input be sorted by key.
61
62       There is no requirement that the pivot field be sorted (provided the
63       key field is already sorted).
64
65       By default, dbfilepivot makes two passes over its data and so requires
66       temporary disk space equal to the input size.  With the
67       --possible-pivots option, the user can specify pivots and skip the
68       second pass and avoid temporary data storage.
69
70       Memory usage is proportional to the number of unique pivot values.
71
72       The inverse of this commend is dbcolsplittorows.
73

OPTIONS

75       -k or --key KeyField
76           specify which column is the key for grouping.  Required (no
77           default).
78
79       -p or --pivot PivotField
80           specify which column is the key to indicate which column in the
81           output is relevant.  Required (no default).
82
83       -v or --value ValueField
84           Specify which column is the value in the output.  If none is given,
85           1 is used for the value.
86
87       --possible-pivots PP
88           Specify all possible pivot values as PP, a whitespace-separated
89           list.  With this option, data is processed only once (not twice).
90
91       -C S or --element-separator S
92           Specify the separator S used to join the input's key column with
93           its contents.  (Defaults to a single underscore.)
94
95       -e E or --empty E
96           give value E as the value for empty (null) records
97
98       -S or --pre-sorted
99           Assume data is already grouped by key.  Provided twice, it removes
100           the validation of this assertion.  By default, we sort by key.
101
102       -T TmpDir
103           where to put tmp files.  Also uses environment variable TMPDIR, if
104           -T is not specified.  Default is /tmp.
105
106       This module also supports the standard fsdb options:
107
108       -d  Enable debugging output.
109
110       -i or --input InputSource
111           Read from InputSource, typically a file name, or "-" for standard
112           input, or (if in Perl) a IO::Handle, Fsdb::IO or Fsdb::BoundedQueue
113           objects.
114
115       -o or --output OutputDestination
116           Write to OutputDestination, typically a file name, or "-" for
117           standard output, or (if in Perl) a IO::Handle, Fsdb::IO or
118           Fsdb::BoundedQueue objects.
119
120       --autorun or --noautorun
121           By default, programs process automatically, but Fsdb::Filter
122           objects in Perl do not run until you invoke the run() method.  The
123           "--(no)autorun" option controls that behavior within Perl.
124
125       --help
126           Show help.
127
128       --man
129           Show full manual.
130

SAMPLE USAGE

132   Input:
133               #fsdb name hw score
134               John       1  97
135               John       2  98
136               John       3  99
137               Paul       2  80
138               Paul       3  82
139
140   Command:
141           cat data.fsdb | dbfilepivot -k name -p hw -v score
142
143   Output:
144               #fsdb name hw_1 hw_2 hw_3
145               John    97      98      99
146               Paul    -       80      82
147               #   | dbfilepivot -k name -p hw -v score
148

CLASS FUNCTIONS

153   new
154           $filter = new Fsdb::Filter::dbfilepivot(@arguments);
155
156       Create a new dbfilepivot object, taking command-line arguments.
157
158   set_defaults
159           $filter->set_defaults();
160
161       Internal: set up defaults.
162
163   parse_options
164           $filter->parse_options(@ARGV);
165
166       Internal: parse command-line arguments.
167
168   _find_possible_pivots
169           $filter->_find_possible_pivots();
170
171       Internal: scan input data to find all possible pivot values.
172
173       Returns npivots, pivots_aref.
174
175   _given_possible_pivots
176           $filter->_given_possible_pivots();
177
178       Internal: parse option of possible pivots.
179
180       Returns npivots, pivots_href.
181
182   setup
183           $filter->setup();
184
185       Internal: setup, parse headers.
186
187   run
188           $filter->run();
189
190       Internal: run over each rows.
191

AUTHOR and COPYRIGHT

193       Copyright (C) 2011-2022 by John Heidemann <johnh@isi.edu>
194
195       This program is distributed under terms of the GNU general public
196       license, version 2.  See the file COPYING with the distribution for
197       details.
198
199
200
201perl v5.36.0                      2022-11-22      Fsdb::Filter::dbfilepivot(3)