1Fsdb::Filter::dbfilepivUoste(r3)Contributed Perl DocumenFtsadtbi:o:nFilter::dbfilepivot(3)
2
3
4
6 dbfilepivot - pivot a table, converting multiple rows into single wide
7 row
8
10 dbfilepivot [-e empty] -k KeyField -p PivotField [-v ValueField]
11
13 Pivot a table, converting multiple rows corresponding to the same key
14 into a single wide row.
15
16 In a normalized database, one might have data with a schema like (id,
17 attribute, value), but sometimes it's more convenient to see the data
18 with a schema like (id, attribute1, attribute2). (For example,
19 gnuplot's stacked histograms requires denormalized data.) Dbfilepivot
20 converts the normalized format to the denormalized, but sometimes
21 useful, format. Here the "id" is the key, the attribute is the
22 "pivot", and the value is, well, the optional "value".
23
24 An example is clearer. A gradebook usually looks like:
25
26 #fsdb name hw_1 hw_2 hw_3
27 John 97 98 99
28 Paul - 80 82
29
30 but a properly normalized format would represent it as:
31
32 #fsdb name hw score
33 John 1 97
34 John 2 98
35 John 3 99
36 Paul 2 80
37 Paul 3 82
38
39 This tool converts the second form into the first, when used as
40
41 dbfilepivot -k name -p hw -v score
42
43 or
44
45 dbfilepivot --possible-pivots='1 2 3' -k name -p hw -v score
46
47 Here name is the key column that indicates which rows belong to the
48 same entity, hw is the pivot column that will be indicate which column
49 in the output is relevant, and score is the value that indicates what
50 goes in the output.
51
52 The pivot creates a new column "key_tag1", "key_tag2", etc. for each
53 tag, the contents of the pivot field in the input. It then populates
54 those new columns with the contents of the value field in the input.
55
56 If no value column is specified, then values are either empty or 1.
57
58 Dbfilepivot assumes all lines with the same key are adjacent in the
59 input source, like dbmapreduce(1) with the -S option. To enforce this
60 invariant, by default, it requires input be sorted by key.
61
62 There is no requirement that the pivot field be sorted (provided the
63 key field is already sorted).
64
65 By default, dbfilepivot makes two passes over its data and so requires
66 temporary disk space equal to the input size. With the
67 --possible-pivots option, the user can specify pivots and skip the
68 second pass and avoid temporary data storage.
69
70 Memory usage is proportional to the number of unique pivot values.
71
72 The inverse of this commend is dbcolsplittorows.
73
75 -k or --key KeyField
76 specify which column is the key for grouping. Required (no
77 default).
78
79 -p or --pivot PivotField
80 specify which column is the key to indicate which column in the
81 output is relevant. Required (no default).
82
83 -v or --value ValueField
84 Specify which column is the value in the output. If none is given,
85 1 is used for the value.
86
87 --possible-pivots PP
88 Specify all possible pivot values as PP, a whitespace-separated
89 list. With this option, data is processed only once (not twice).
90
91 -C S or --element-separator S
92 Specify the separator S used to join the input's key column with
93 its contents. (Defaults to a single underscore.)
94
95 -e E or --empty E
96 give value E as the value for empty (null) records
97
98 -S or --pre-sorted
99 Assume data is already grouped by key. Provided twice, it removes
100 the validation of this assertion. By default, we sort by key.
101
102 -T TmpDir
103 where to put tmp files. Also uses environment variable TMPDIR, if
104 -T is not specified. Default is /tmp.
105
106 This module also supports the standard fsdb options:
107
108 -d Enable debugging output.
109
110 -i or --input InputSource
111 Read from InputSource, typically a file name, or "-" for standard
112 input, or (if in Perl) a IO::Handle, Fsdb::IO or Fsdb::BoundedQueue
113 objects.
114
115 -o or --output OutputDestination
116 Write to OutputDestination, typically a file name, or "-" for
117 standard output, or (if in Perl) a IO::Handle, Fsdb::IO or
118 Fsdb::BoundedQueue objects.
119
120 --autorun or --noautorun
121 By default, programs process automatically, but Fsdb::Filter
122 objects in Perl do not run until you invoke the run() method. The
123 "--(no)autorun" option controls that behavior within Perl.
124
125 --help
126 Show help.
127
128 --man
129 Show full manual.
130
132 Input:
133 #fsdb name hw score
134 John 1 97
135 John 2 98
136 John 3 99
137 Paul 2 80
138 Paul 3 82
139
140 Command:
141 cat data.fsdb | dbfilepivot -k name -p hw -v score
142
143 Output:
144 #fsdb name hw_1 hw_2 hw_3
145 John 97 98 99
146 Paul - 80 82
147 # | dbfilepivot -k name -p hw -v score
148
150 Fsdb(3). dbcolmerge(1). dbcolsplittorows(1). dbcolsplittocols(1).
151
153 new
154 $filter = new Fsdb::Filter::dbfilepivot(@arguments);
155
156 Create a new dbfilepivot object, taking command-line arguments.
157
158 set_defaults
159 $filter->set_defaults();
160
161 Internal: set up defaults.
162
163 parse_options
164 $filter->parse_options(@ARGV);
165
166 Internal: parse command-line arguments.
167
168 _find_possible_pivots
169 $filter->_find_possible_pivots();
170
171 Internal: scan input data to find all possible pivot values.
172
173 Returns npivots, pivots_aref.
174
175 _given_possible_pivots
176 $filter->_given_possible_pivots();
177
178 Internal: parse option of possible pivots.
179
180 Returns npivots, pivots_href.
181
182 setup
183 $filter->setup();
184
185 Internal: setup, parse headers.
186
187 run
188 $filter->run();
189
190 Internal: run over each rows.
191
193 Copyright (C) 2011-2018 by John Heidemann <johnh@isi.edu>
194
195 This program is distributed under terms of the GNU general public
196 license, version 2. See the file COPYING with the distribution for
197 details.
198
199
200
201perl v5.28.1 2018-02-17 Fsdb::Filter::dbfilepivot(3)