Fsdb::IO(3) User Contributed Perl Documentation Fsdb::IO(3)

Fsdb::IO - base class for Fsdb IO (FsdbReader and FsdbWriter)

There are several ways to do IO. We look at several that compute the
product of x and y for this input:

    #fsdb x y product
    1 10 -
    2 20 -

The following routes go from easiest to use to hardest, and also from
least efficient to most. For IO-intensive work, if fastpath takes 1
unit of time, then using hashes or arrays takes approximately 2 units
of time, all due to CPU overhead.

Using A Hash
    use Fsdb::IO::Reader;
    use Fsdb::IO::Writer;

    # preamble
    my $out;
    my $in = new Fsdb::IO::Reader(-file => '-', -comment_handler => \$out)
        or die "cannot open stdin as fsdb\n";
    $out = new Fsdb::IO::Writer(-file => '-', -clone => $in)
        or die "cannot open stdout as fsdb\n";

    # core starts here
    my %hrow;
    while ($in->read_row_to_href(\%hrow)) {
        $hrow{product} = $hrow{x} * $hrow{y};
        $out->write_row_from_href(\%hrow);
    }

It can be convenient to use a hash because one can easily extract
fields using hash keys, but hashes can be slow.

Arrays Instead of Hashes
We can add a bit to the end of the preamble:

    my $x_i = $in->col_to_i('x') // die "no x column.\n";
    my $y_i = $in->col_to_i('y') // die "no y column.\n";
    my $product_i = $in->col_to_i('product') // die "no product column.\n";

And then replace the core with arrays:

    my @arow;
    while ($in->read_row_to_aref(\@arow)) {
        $arow[$product_i] = $arow[$x_i] * $arow[$y_i];
        $out->write_row_from_aref(\@arow);
    }

This code has two advantages over hrefs. First, there is explicit error
checking for the presence of the expected fields. Second, arrays are
likely a bit faster than hashes.

Objects Instead of Arrays
Keeping the same preamble as for arrays, we can directly get internal
Fsdb "row objects" with a new core:

    # core
    my $rowobj;
    while ($rowobj = $in->read_rowobj) {
        if (!ref($rowobj)) {
            # comment
            &{$in->{_comment_sub}}($rowobj);
            next;
        }
        $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
        $out->write_rowobj($rowobj);
    }

This code is a bit faster because we just return the internal
representation (a rowobj), rather than copy into an array.

Unfortunately, reading row objects does not handle comment processing
for us. The core must detect comments itself (they arrive as
non-reference values) and pass them to the comment handler, as above.

Fastpathing
To go really fast, we can build a custom thunk (a chunk of code) that
does exactly what we want. This approach is called a "fastpath".

It requires a bit more in the preamble (building on the array version):

    my $in_fastpath_sub = $in->fastpath_sub();
    my $out_fastpath_sub = $out->fastpath_sub();

And it allows a shorter core (modeled on rowobjs), since the fastpath
includes comment processing:

    my $rowobj;
    while ($rowobj = &$in_fastpath_sub) {
        $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
        &$out_fastpath_sub($rowobj);
    }

This code is the fastest way to implement this block without evaling
code.

new
    $fsdb = new Fsdb::IO;

Creates a new IO object. Usually you should not create an Fsdb::IO
object directly, but instead create a "FsdbReader" or "FsdbWriter".

Options:

-fh FILE_HANDLE
    Write IO to the given file handle.
-header HEADER_LINE
    Force the header to the given HEADER_LINE (should be verbatim,
    including #h or whatever).
-fscode CODE
    Define just the column (or field) separator fscode part of the
    header. See dbfilealter for a list of valid field separators.
-rscode CODE
    Define just the row separator part of the header. See dbfilealter
    for a list of valid row separators.
-cols CODE
    Define just the columns of the header.
-compression CODE
    Define the compression mode for the file that will take effect
    after the header.
-clone $fsdb
    Copy the stream's configuration from $FSDB, another Fsdb::IO
    object.

_reset_cols
    $fsdb->_reset_cols

Internal: zero all the mappings in the current schema.

config_one
    $fsdb->config_one($arglist_aref);

Parse the first configuration option on the list, removing it.

Options are listed in new.

config
    $fsdb->config(-arg1 => $value1, -arg2 => $value2);

Parse all options in the list.
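
The relationship between the two calls can be sketched in plain Perl. This is only an illustration under stated assumptions, not Fsdb's implementation: "config_one_sketch" and "config_sketch" are hypothetical names, the stream is modeled as a bare hash, and only two of the options listed under new are recognized.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for config_one and config; not Fsdb's own code.
# Only two of the options listed under "new" are modeled here.
sub config_one_sketch {
    my ($self, $arglist_aref) = @_;
    my $opt = shift @$arglist_aref;          # consume the option name
    if ($opt eq '-fscode' || $opt eq '-cols') {
        my $value = shift @$arglist_aref;    # consume its value
        (my $key = $opt) =~ s/^-//;
        $self->{$key} = $value;
    } else {
        die "unknown option: $opt\n";
    }
    return;
}

# Parse every option by repeatedly parsing the first one.
sub config_sketch {
    my ($self, @args) = @_;
    config_one_sketch($self, \@args) while @args;
    return $self;
}

my $stream = config_sketch({}, -fscode => 't', -cols => [qw(x y product)]);
print "fscode=$stream->{fscode} cols=@{ $stream->{cols} }\n";
```

Because the single-option parser removes what it consumes from the list, the all-options loop ends when the list is empty.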

default_binmode
    $fsdb->default_binmode();

Set the file to the correct binmode, either given by "-encoding" at
setup, or defaulting from "LC_CTYPE" or "LANG".

If the file is compressed, we will reset binmode after reading the
header.

compare
    $result = $fsdb->compare($other_fsdb);

Compares two Fsdb::IO objects, returning the string "identical" (same
field separator, columns, and column order), the string "compatible"
(same field separator but different columns), or undef if they differ.
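
The three outcomes can be modeled with a small stand-alone function. This is a sketch of the rules as stated above, not the library's code: "compare_sketch" is a hypothetical name, and each schema is assumed to be a plain hash holding an fscode string and an aref of column names.

```perl
use strict;
use warnings;

# Hypothetical model of the comparison; each schema is a plain hash with
# an fscode (treated as an opaque string) and an aref of column names.
sub compare_sketch {
    my ($first, $second) = @_;
    # Different field separators: the streams differ.
    return undef if $first->{fscode} ne $second->{fscode};
    my @first_cols  = @{ $first->{cols} };
    my @second_cols = @{ $second->{cols} };
    # Same separator, but a different number of columns.
    return 'compatible' if @first_cols != @second_cols;
    # Same separator and count: names and order must match exactly.
    for my $i (0 .. $#first_cols) {
        return 'compatible' if $first_cols[$i] ne $second_cols[$i];
    }
    return 'identical';
}

my $in_schema  = { fscode => 't', cols => [qw(x y product)] };
my $out_schema = { fscode => 't', cols => [qw(x y)] };
print compare_sketch($in_schema, $in_schema), "\n";
print compare_sketch($in_schema, $out_schema), "\n";
```

The two print statements show "identical" and then "compatible". Callers should test the result with "defined", since the third outcome is undef.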

close
    $fsdb->close;

Closes the file, frees open file handle, or sends an EOF signal (and
undef) down the open queue.

error
    $fsdb->error;

Returns a descriptive string if there is an error, or undef if not.

The string will never end in a newline or punctuation.

update_v1_headerrow
Internal: create the header from the internal schema.

parse_v1_headerrow
Internal: interpret the v1 header.

update_headerrow
Internal: create the header from the internal schema.

parse_headerrow
Internal: interpret the v2 header. Format is:

    #fsdb [-F x] [-R x] [-Z x] columns

All options must come first, start with dashes, and have an argument.
(More regular than the v1 header.)
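
The regularity of the v2 format makes it easy to split mechanically. The following is a minimal sketch of such a split, assuming whitespace-separated tokens. "parse_headerrow_sketch" is a hypothetical name, the option values are kept as opaque codes without validation, and none of this is taken from Fsdb's own parser.

```perl
use strict;
use warnings;

# Hypothetical parser for the v2 header shape shown above; not Fsdb's code.
# Option values are kept as opaque codes and are not validated.
sub parse_headerrow_sketch {
    my ($line) = @_;
    my @tokens = split ' ', $line;
    my $magic = shift @tokens;
    return undef unless defined $magic && $magic eq '#fsdb';
    my %opts;
    # Options come first; each starts with a dash and takes one argument.
    while (@tokens && $tokens[0] =~ /^-([FRZ])$/) {
        my $name = $1;
        shift @tokens;
        return undef unless @tokens;    # option without an argument
        $opts{$name} = shift @tokens;
    }
    # Whatever remains is the list of column names.
    my %header = (%opts, cols => [@tokens]);
    return \%header;
}

my $header = parse_headerrow_sketch('#fsdb -F t x y product');
print "F=$header->{F} cols=@{ $header->{cols} }\n";
```

For the sample line this prints "F=t cols=x y product". A header with no options, such as the one in the input example at the top of this page, yields only the column list.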

parse_v1_fscode
Internal.

parse_fscode
Parse the field separator. See dbfilealter for a list of valid values.

parse_rscode
Internal: Interpret rscodes.

See dbfilealter for a list of valid values.

parse_compression
Internal: Interpret compression.

See dbfilealter for a list of valid values.

establish_new_col_mapping
Internal.

col_create
    $fsdb->col_create($col_name)

Add a new column named $COL_NAME to the schema. Returns undef on
failure, or 1 if successful. (Note: does not return the column index on
creation, so that "or" can be used for error checking, given that the
column number could be zero.) Also, update the header row to reflect
this column (compare to "_internal_col_create").

_internal_col_create
    $fsdb->_internal_col_create($col_name)

For internal "Fsdb::IO" use only. Create a new column $COL_NAME, just
like "col_create", but do not update the header row (as that function
does).

field_contains_fs
    $boolean = $fsdb->field_contains_fs($field);

Determine if the $FIELD contains $FSDB's fscode (in which case it is
malformed).

fref_contains_fs
    $boolean = $fsdb->fref_contains_fs($fref);

Determine if any field in $FREF contains $FSDB's fscode (in which case
it is malformed).

correct_fref_containing_fs
    $boolean = $fsdb->correct_fref_containing_fs($fref);

Patch up any field in $FREF that contains $FSDB's fscode, as best as
possible, by turning the field separator into underscores. Updates
$FREF in place, and returns whether it was altered. This function loses
data.
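
The repair can be pictured with a short stand-alone sketch. It is an illustration only: "correct_fields_sketch" is a hypothetical name, the separator is passed in as a literal string, and replacing each occurrence with exactly one underscore is an assumption about the details.

```perl
use strict;
use warnings;

# Hypothetical helper, not Fsdb's implementation. $fs is the literal
# field separator; each occurrence becomes one underscore (an assumption).
sub correct_fields_sketch {
    my ($fs, $fref) = @_;
    my $altered = 0;
    for my $field (@$fref) {
        # $field aliases the array element, so this edits the row in place.
        $altered = 1 if $field =~ s/\Q$fs\E/_/g;
    }
    return $altered;
}

my @row = ("plain", "has\ttab");
my $changed = correct_fields_sketch("\t", \@row);
print "$changed: $row[1]\n";
```

This prints "1: has_tab". The loss of data is visible here: after the repair there is no way to tell an underscore that was originally a separator from one that was always in the field.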

fscode
    $fscode = $fsdb->fscode;

Returns the fscode of the given database. (The encoded version
representing the field separator.) See also fs to get the actual field
separator.

fs
    $fs = $fsdb->fs;

Returns the field separator. See "fscode" to get the "encoded"
version.

rscode
    $rscode = $fsdb->rscode;

Returns the rscode of the given database.

ncols
    $ncols = $fsdb->ncols;

Return the number of columns.

cols
    $fields_aref = $fsdb->cols;

Returns the column headings (the field names) of the open database as
an aref.

col_to_i
    $column_index = $fsdb->col_to_i($column_name);

Returns the column index (0-based) of a given $COLUMN_NAME.

Note: tests for existence of columns must use "defined", since the
index can be 0 which would be interpreted as false.
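
The pitfall is easy to reproduce without a real stream. In this sketch a plain hash stands in for the name-to-index lookup (an assumption made so the example runs on its own), with x at index 0 as in the examples at the top of this page.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a stream's name-to-index lookup.
my %col_to_i = (x => 0, y => 1, product => 2);

# A truth test wrongly reports the first column as missing.
my $x_truthy = $col_to_i{x} ? 'found' : 'missing';

# A defined test gets both the present and the absent column right.
my $x_defined = defined($col_to_i{x}) ? 'found' : 'missing';
my $z_defined = defined($col_to_i{z}) ? 'found' : 'missing';

print "$x_truthy $x_defined $z_defined\n";
```

This prints "missing found missing". The defined-or operator used in the array preamble ("// die") is the same test written compactly.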

i_to_col
    $column_name = $fsdb->i_to_col($column_index);

Return the name of the COLUMN_INDEX-th (0-based) column.

fastpath_cancel
    $fsdb->fastpath_cancel();

Discard any active fastpath code and allow fastpath-incompatible
operations.

codify
    ($code, $has_last_refs) = $self->codify($underscored_pseudocode);

Convert db-code $UNDERSCORED_PSEUDOCODE into perl code in the context
of a given Fsdb stream.

We return a string of code $CODE that refs "@{$fref}" and "@{$lfref}"
for the current and prior row arrays, and a flag $HAS_LAST_REFS if
"@{$lfref}" is needed. It is the caller's job to set these up, probably
by evaling the returned string in the context of those variables.

The conversion is a rename of all _foo's into database fields. For
more perverse needs, _foo(N) means the Nth field after _foo. Also, as
of 29-Jan-00, _last_foo gives the last row's value (_last_foo(N) is not
supported). To convert we eval $codify_code.

20-Feb-07: _FROMFILE_foo opens the file called _foo and includes it in
place.

NEEDSWORK: Should make some attempt to catch misspellings of column
names.
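
The core rename can be sketched with a single substitution. This is a simplified illustration, not the library's conversion: "codify_sketch" is a hypothetical name, the column list is passed in explicitly, and only the plain _foo and _last_foo forms are handled (neither _foo(N) nor _FROMFILE_foo). Unlike the behavior described above, the sketch dies on an unknown column name.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT Fsdb's codify: rewrite _foo and _last_foo
# into array accesses on $fref and $lfref.
sub codify_sketch {
    my ($pseudocode, @cols) = @_;
    my %col_to_i;
    @col_to_i{@cols} = (0 .. $#cols);
    my $has_last_refs = 0;
    (my $code = $pseudocode) =~ s{(?<!\w)_(last_)?([A-Za-z]\w*)}{
        my ($last, $name) = ($1, $2);
        die "unknown column: $name\n" unless defined $col_to_i{$name};
        $has_last_refs = 1 if $last;
        ($last ? '$lfref' : '$fref') . "->[$col_to_i{$name}]";
    }ge;
    return ($code, $has_last_refs);
}

my ($code, $has_last) = codify_sketch('_product = _x * _y', qw(x y product));

# The caller sets up $fref (and $lfref if needed) and evals the string.
my $fref = [3, 4, '-'];
my $lfref;
eval $code;
die $@ if $@;
print "$fref->[2]\n";
```

Here $code becomes '$fref->[2] = $fref->[0] * $fref->[1]' and the script prints 12. The returned flag tells the caller whether the prior-row array has to be maintained at all.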

clean_potential_columns
    @clean = Fsdb::IO::clean_potential_columns(@dirty);

Clean up user-provided column names.

perl v5.28.1 2016-09-04 Fsdb::IO(3)