Fsdb::IO(3)           User Contributed Perl Documentation          Fsdb::IO(3)

       Fsdb::IO - base class for Fsdb IO (FsdbReader and FsdbWriter)

       There are several ways to do IO.  The examples below each compute the
       product of x and y for this input:

           #fsdb x y product
           1 10 -
           2 20 -

       The following approaches are ordered from easiest to use to hardest
       to use, and also from least efficient to most efficient.  For
       IO-intensive work, if fastpath takes 1 unit of time, then using hashes
       or arrays takes approximately 2 units of time, all due to CPU overhead.

   Using A Hash
           use Fsdb::IO::Reader;
           use Fsdb::IO::Writer;

           # preamble
           # $out is declared first so the reader can name it as its
           # comment handler before the writer exists
           my $out;
           my $in = Fsdb::IO::Reader->new(-file => '-', -comment_handler => \$out)
               or die "cannot open stdin as fsdb\n";
           # -clone copies the stream configuration from $in
           $out = Fsdb::IO::Writer->new(-file => '-', -clone => $in)
               or die "cannot open stdout as fsdb\n";

           # core starts here
           my %hrow;
           while ($in->read_row_to_href(\%hrow)) {
               $hrow{product} = $hrow{x} * $hrow{y};
               $out->write_row_from_href(\%hrow);
           }

       Using a hash is convenient because fields can be extracted by name
       with hash keys, but hashes can be slow.

   Arrays Instead of Hashes
       We can add a bit to the end of the preamble:

           # look up each column index once; // is used because index 0 is valid
           my $x_i = $in->col_to_i('x') // die "no x column.\n";
           my $y_i = $in->col_to_i('y') // die "no y column.\n";
           my $product_i = $in->col_to_i('product') // die "no product column.\n";

       And then replace the core with arrays:

           my @arow;
           while ($in->read_row_to_aref(\@arow)) {
               $arow[$product_i] = $arow[$x_i] * $arow[$y_i];
               $out->write_row_from_aref(\@arow);
           }

       This code has two advantages over hrefs.  First, it explicitly checks
       that the expected fields are present.  Second, arrays are likely a bit
       faster than hashes.

   Objects Instead of Arrays
       Keeping the same preamble as for arrays, we can directly get internal
       Fsdb "row objects" with a new core:

           # core
           my $rowobj;
           while ($rowobj = $in->read_rowobj) {
               if (!ref($rowobj)) {
                   # not a reference, so this is a comment:
                   # hand it to the comment handler ourselves
                   $in->{_comment_sub}->($rowobj);
                   next;
               }
               $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
               $out->write_rowobj($rowobj);
           }

       This code is a bit faster because we just return the internal
       representation (a rowobj), rather than copying it into an array.

       Unfortunately, read_rowobj does not do comment processing itself, so
       the core must detect comments (values that are not references) and
       pass them to the comment handler by hand, as shown above.

   Fastpathing
       To go really fast, we can build a custom thunk (a chunk of code) that
       does exactly what we want.  This approach is called a "fastpath".

       It requires a bit more in the preamble (building on the array version):

           my $in_fastpath_sub = $in->fastpath_sub();
           my $out_fastpath_sub = $out->fastpath_sub();

       And it allows a shorter core (modeled on rowobjs), since the fastpath
       includes comment processing:

           my $rowobj;
           while ($rowobj = $in_fastpath_sub->()) {
               $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
               $out_fastpath_sub->($rowobj);
           }

       This code is the fastest way to implement this block without evaling
       code.

   new
           $fsdb = Fsdb::IO->new;

       Creates a new IO object.  Usually you should not create an Fsdb::IO
       object directly, but instead create an "FsdbReader" or "FsdbWriter".

       Options:

       -fh FILE_HANDLE
           Write IO to the given file handle.

       -header HEADER_LINE
           Force the header to the given HEADER_LINE (should be verbatim,
           including #h or whatever).

       -fscode CODE
           Define just the column (or field) separator fscode part of the
           header.  See dbfilealter for a list of valid field separators.

       -rscode CODE
           Define just the row separator part of the header.  See dbfilealter
           for a list of valid row separators.

       -cols CODE
           Define just the columns of the header.

       -compression CODE
           Define the compression mode for the file that will take effect
           after the header.

       -clone $fsdb
           Copy the stream's configuration from $FSDB, another Fsdb::IO
           object.

   _reset_cols
           $fsdb->_reset_cols

       Internal: zero all the mappings in the current schema.

   _find_filename_decompressor
       Returns the name of the decompression program for FILE if it ends in
       a compression extension.
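
       As an illustration of the idea only, a lookup by filename extension
       might look like the sketch below.  The function name and the
       extension table are assumptions made for this example; the set of
       extensions and programs that Fsdb actually recognizes may differ.

```perl
use strict;
use warnings;

# Hypothetical extension-to-program table (an assumption for this sketch).
my %DECOMPRESSOR_FOR = (
    gz  => 'gzip',
    bz2 => 'bzip2',
    xz  => 'xz',
);

# Return the decompression program for $file, or undef if the name does
# not end in a known compression extension.
sub find_filename_decompressor_sketch {
    my ($file) = @_;
    return undef unless defined $file;
    my ($ext) = $file =~ /\.([^.\/]+)\z/;    # text after the final dot
    return undef unless defined $ext;
    return $DECOMPRESSOR_FOR{ lc $ext };
}
```

       With this table, a name such as data.fsdb.gz maps to gzip, and a name
       with no compression extension maps to undef.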

   config_one
           $fsdb->config_one($arglist_aref);

       Parse the first configuration option on the list, removing it.

       Options are listed in new.
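
       The relationship between config_one and config can be pictured with
       the following sketch.  It is not the module's code: the function
       names and the plain hash used to hold settings are hypothetical, and
       it assumes every option listed under new takes exactly one value.

```perl
use strict;
use warnings;

# Options from the list under "new"; in this sketch each takes one value.
my %TAKES_VALUE = map { $_ => 1 }
    qw(-fh -header -fscode -rscode -cols -compression -clone);

# Consume the first option and its value from @$args, storing it in %$conf.
sub config_one_sketch {
    my ($conf, $args) = @_;
    my $opt = shift @$args;
    die "unknown option: $opt\n" unless $TAKES_VALUE{$opt};
    die "option $opt requires a value\n" unless @$args;
    $conf->{$opt} = shift @$args;
    return;
}

# Parse every option by calling config_one_sketch until the list is empty.
sub config_sketch {
    my ($conf, @args) = @_;
    config_one_sketch($conf, \@args) while @args;
    return $conf;
}
```

       Because each call removes what it parsed, the caller can loop until
       the argument list is empty.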

   config
           $fsdb->config(-arg1 => $value1, -arg2 => $value2);

       Parse all options in the list.

   default_binmode
           $fsdb->default_binmode();

       Set the file to the correct binmode, either given by "-encoding" at
       setup, or defaulting from "LC_CTYPE" or "LANG".

       If the file is compressed, we will reset binmode after reading the
       header.

   compare
           $result = $fsdb->compare($other_fsdb)

       Compares two Fsdb::IO objects, returning the string "identical" (same
       field separator, columns, and column order), the string "compatible"
       (same field separator but different columns), or undef if they differ.
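
       The three outcomes can be illustrated with a small stand-alone
       sketch.  It works on hypothetical hashrefs rather than real Fsdb::IO
       objects, and the fscode values are placeholders, so treat it as a
       restatement of the rule above rather than the module's
       implementation.

```perl
use strict;
use warnings;

# Each schema is a hypothetical hashref: { fscode => 'D', cols => [qw(x y)] }.
sub compare_sketch {
    my ($left, $right) = @_;

    # Different field separators: the streams differ.
    return undef if $left->{fscode} ne $right->{fscode};

    my @l = @{ $left->{cols} };
    my @r = @{ $right->{cols} };

    # Same separator, same columns in the same order.
    return 'identical'
        if @l == @r && !grep { $l[$_] ne $r[$_] } 0 .. $#l;

    # Same separator, but the columns differ.
    return 'compatible';
}
```

       Since one of the results is undef, callers should test the result
       with "defined" before comparing it to a string.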

   close
           $fsdb->close;

       Closes the file, frees the open file handle, or sends an EOF signal
       (and undef) down the open queue.

   error
           $fsdb->error;

       Returns a descriptive string if there is an error, or undef if not.

       The string will never end in a newline or punctuation.

   update_v1_headerrow
       Internal: create the header from the internal schema.

   parse_v1_headerrow
       Internal: interpret the v1 header.

   update_headerrow
       Internal: create the header from the internal schema.

   parse_headerrow
       Internal: interpret the v2 header.  The format is:

           #fsdb [-F x] [-R x] [-Z x] columns

       All options must come first, start with dashes, and have an argument.
       (This is more regular than the v1 header.)

       Columns have optional :t type specifiers.
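
       To make the header format concrete, here is a minimal parsing sketch
       written from the description above.  The function name is
       hypothetical and this is not the module's parser: it does no
       validation of option values, leaves any :t type specifier attached
       to the column word, and treats any other dash-prefixed word as a
       column.

```perl
use strict;
use warnings;

# Parse a header of the form:  #fsdb [-F x] [-R x] [-Z x] columns
# Returns (\%options, \@colspecs), or an empty list if the line is not a
# well-formed header.
sub parse_headerrow_sketch {
    my ($line) = @_;
    chomp $line;
    my @words = split ' ', $line;
    return unless @words && shift(@words) eq '#fsdb';

    my %opts;
    # All options come first, start with a dash, and take one argument.
    while (@words && $words[0] =~ /\A-([FRZ])\z/) {
        my $key = $1;
        shift @words;
        return unless @words;    # option is missing its argument
        $opts{$key} = shift @words;
    }
    return (\%opts, \@words);
}
```

       For the example input used earlier, the header "#fsdb x y product"
       yields no options and the three column words x, y, and product.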

   parse_v1_fscode
       Internal.

   parse_fscode
       Parse the field separator.  See dbfilealter for a list of valid
       values.

   parse_rscode
       Internal: interpret rscodes.

       See dbfilealter for a list of valid values.

   parse_compression
       Internal: interpret compression.

       See dbfilealter for a list of valid values.

   establish_new_col_mapping
       Internal.

   col_create
           $fsdb->col_create($col_name)

       Add a new column named $COL_NAME to the schema.  Returns undef on
       failure, or 1 if successful.  (Note: it does not return the column
       index on creation, so that "or" can be used for error checking even
       though the column number could be zero.)  Also updates the header row
       to reflect this column (compare to "_internal_col_create").

   colspec_to_name_type_spec
           ($name, $type, $type_speced) = $fsdb->colspec_to_name_type($colspec)

       Split a colspec into a name, type, and the type as specified (which
       may be null if no type was given).
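
       A rough sketch of such a split follows.  It assumes a colspec is a
       name optionally followed by a colon and a type, as in the ":t"
       specifiers mentioned under parse_headerrow.  The function name and
       the $default_type parameter are hypothetical, and the type letters
       in the usage note are placeholders, not a list of valid Fsdb types.

```perl
use strict;
use warnings;

# Split "name" or "name:type" into (name, effective type, type as specified).
# $default_type is a stand-in for whatever type applies when none is given.
sub colspec_to_name_type_sketch {
    my ($colspec, $default_type) = @_;
    my ($name, $type_speced) = split /:/, $colspec, 2;
    my $type = (defined $type_speced && length $type_speced)
        ? $type_speced
        : $default_type;
    return ($name, $type, $type_speced);
}
```

       Called with ('y:i', 'a') this returns ('y', 'i', 'i'); called with
       ('x', 'a') it returns ('x', 'a', undef), so the third value records
       that no type was written in the colspec.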

   _internal_col_create
           $fsdb->_internal_col_create($colspec)

       For internal "Fsdb::IO" use only.  Create a new column from $COLSPEC,
       just like "col_create", but do not update the header row (as that
       function does).

   field_contains_fs
           $boolean = $fsdb->field_contains_fs($field);

       Determine if the $FIELD contains $FSDB's fscode (in which case it is
       malformed).

   fref_contains_fs
           $boolean = $fsdb->fref_contains_fs($fref);

       Determine if any field in $FREF contains $FSDB's fscode (in which case
       it is malformed).

   correct_fref_containing_fs
           $boolean = $fsdb->correct_fref_containing_fs($fref);

       Patch up any field in $FREF that contains $FSDB's fscode, as best as
       possible, by turning the field separator into underscores.  Updates
       $FREF in place, and returns whether it was altered.  This function
       loses data.
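
       The check and the repair can be sketched in plain Perl as follows.
       This is an illustration under simplifying assumptions, not the
       module's code: the function names are hypothetical, and the field
       separator is passed in as a literal string rather than taken from an
       Fsdb::IO object.

```perl
use strict;
use warnings;

# True if any field in @$fref contains the literal separator $fs.
sub fref_contains_fs_sketch {
    my ($fs, $fref) = @_;
    for my $field (@$fref) {
        return 1 if index($field, $fs) >= 0;
    }
    return 0;
}

# Replace each occurrence of $fs inside a field with an underscore.
# Modifies @$fref in place and returns 1 if anything changed, else 0.
sub correct_fref_containing_fs_sketch {
    my ($fs, $fref) = @_;
    my $altered = 0;
    for my $field (@$fref) {    # $field aliases the array element
        $altered = 1 if $field =~ s/\Q$fs\E/_/g;
    }
    return $altered;
}
```

       The repair is lossy because, afterwards, a field that originally
       held a separator cannot be told apart from one that held an
       underscore.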

   fscode
           $fscode = $fsdb->fscode;

       Returns the fscode of the given database.  (The encoded version
       representing the field separator.)  See also fs to get the actual
       field separator.

   fs
           $fs = $fsdb->fs;

       Returns the field separator.  See "fscode" to get the "encoded"
       version.

   rscode
           $rscode = $fsdb->rscode;

       Returns the rscode of the given database.

   ncols
           $ncols = $fsdb->ncols;

       Return the number of columns.

   cols
           $fields_aref = $fsdb->cols;

       Returns the column names (the field names, without type
       specifications) of the open database as an aref.

   colspecs
           $fields_aref = $fsdb->colspecs();

       Returns the column headings (the field names) of the open database as
       an aref.

   col_to_i
           $coli = $fsdb->col_to_i($column_name);

       Returns the column index (0-based) of a given $COLUMN_NAME.  (Names
       cannot have types with them.)

       Note: tests for existence of columns must use "defined", since the
       index can be 0, which would be interpreted as false.
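
       The pitfall is easy to reproduce without Fsdb at all.  In the sketch
       below, a plain hash stands in for the schema and col_to_i_sketch is a
       hypothetical stand-in for the lookup; only the two ways of testing
       the result matter.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a schema: column name => 0-based index.
my %INDEX_OF = (x => 0, y => 1, product => 2);

# Returns the index, or undef for an unknown column.
sub col_to_i_sketch {
    my ($name) = @_;
    return $INDEX_OF{$name};
}

# WRONG: a truth test treats index 0 as "no such column".
sub has_col_by_truth {
    my ($name) = @_;
    return col_to_i_sketch($name) ? 1 : 0;
}

# RIGHT: only undef means the column is missing.
sub has_col_by_defined {
    my ($name) = @_;
    return defined(col_to_i_sketch($name)) ? 1 : 0;
}
```

       Here has_col_by_truth('x') reports 0 even though x exists at index 0,
       while has_col_by_defined('x') reports 1.  The defined-or operator
       (//) gives the same protection in a single expression.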

   colspec_to_i
           $coli = $fsdb->colspec_to_i($column_specification);

       Returns the column index (0-based) of a given $COLUMN_SPECIFICATION.
       The name may or may not include a type.

       Note: tests for existence of columns must use "defined", since the
       index can be 0, which would be interpreted as false.

   col_to_name
           $name = $fsdb->col_to_name($column_name_or_index);

       Returns the column name of a given $COLUMN_NAME_OR_INDEX.

   col_to_type
           $type = $fsdb->col_to_type($column_name, $force_type);

       Returns the column type of a given $COLUMN_NAME, or undef if the type
       is not required (unless $FORCE_TYPE is set).

   col_to_colspec
           $colspec = $fsdb->col_to_colspec($column_name, $force_type);

       Returns the column specification of a given $COLUMN_NAME.  The type is
       optional unless $FORCE_TYPE is set.

   col_type_is_numeric
           $is_numeric = $fsdb->col_type_is_numeric($column_name);

       Returns non-zero if the column specification is numeric.  (Actually,
       returns 1 for integers and 2 for floats.)

   i_to_col
           $name = $fsdb->i_to_col($column_index);

       Return the name of the COLUMN_INDEX-th (0-based) column.

   fastpath_cancel
           $fsdb->fastpath_cancel();

       Discard any active fastpath code and allow fastpath-incompatible
       operations.

   codify
           ($code, $has_last_refs) = $self->codify($underscored_pseudocode);

       Convert db-code $UNDERSCORED_PSEUDOCODE into perl code in the context
       of a given Fsdb stream.

       We return a string of code $CODE that refs "@{$fref}" and "@{$lfref}"
       for the current and prior row arrays, and a flag $HAS_LAST_REFS if
       "@{$lfref}" is needed.  It is the caller's job to set these up,
       probably by evaling the returned string in the context of those
       variables.

       The conversion is a rename of all _foo's into database fields.  For
       more perverse needs, _foo(N) means the Nth field after _foo.  Also, as
       of 29-Jan-00, _last_foo gives the last row's value (_last_foo(N) is
       not supported).  To convert we eval $codify_code.

       20-Feb-07: _FROMFILE_foo opens the file called _foo and includes it in
       place.

       NEEDSWORK: Should make some attempt to catch misspellings of column
       names.
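
       A much-simplified sketch of the renaming step is shown below.  It is
       an assumption-laden illustration, not the module's codify: the
       function name is hypothetical, the schema is passed as a plain
       name-to-index hashref, and only _foo and _last_foo are handled
       (_foo(N) and _FROMFILE_foo are not).

```perl
use strict;
use warnings;

# Rewrite _foo as $fref->[i] and _last_foo as $lfref->[i], where i is the
# index of column foo in %$col_to_i.  Unknown names are left untouched.
# Returns ($code, $has_last_refs).
sub codify_sketch {
    my ($pseudocode, $col_to_i) = @_;
    my $has_last_refs = 0;

    my $rewrite = sub {
        my ($last, $name) = @_;
        my $full = $last . $name;
        # A column literally named e.g. "last_x" wins over "last row's x".
        if (exists $col_to_i->{$full}) {
            return '$fref->[' . $col_to_i->{$full} . ']';
        }
        if ($last ne '' && exists $col_to_i->{$name}) {
            $has_last_refs = 1;
            return '$lfref->[' . $col_to_i->{$name} . ']';
        }
        return '_' . $full;    # not a column: keep the original text
    };

    (my $code = $pseudocode)
        =~ s{\b_(last_)?([A-Za-z]\w*)}{ $rewrite->($1 // '', $2) }ge;
    return ($code, $has_last_refs);
}
```

       With columns x, y, and product at indexes 0, 1, and 2, the input
       "_product = _x * _y;" becomes "$fref->[2] = $fref->[0] * $fref->[1];".
       The caller would then eval that string with $fref (and $lfref, when
       the flag is set) in scope.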

   clean_potential_columns
           @clean = Fsdb::IO::clean_potential_columns(@dirty);

       Clean up user-provided column names.

perl v5.38.0                      2023-07-20                       Fsdb::IO(3)