Fsdb::IO(3) User Contributed Perl Documentation Fsdb::IO(3)


Fsdb::IO - base class for Fsdb IO (FsdbReader and FsdbWriter)

There are several ways to do IO. We look at several that compute the
product of x and y for this input:

    #fsdb x y product
    1 10 -
    2 20 -

The following approaches are ordered from easiest to use to hardest,
and also from least efficient to most efficient. For IO-intensive work,
if fastpath takes 1 unit of time, then using hashes or arrays takes
approximately 2 units of time, all due to CPU overhead.

Using A Hash
    use Fsdb::IO::Reader;
    use Fsdb::IO::Writer;

    # preamble
    my $out;
    my $in = new Fsdb::IO::Reader(-file => '-', -comment_handler => \$out)
        or die "cannot open stdin as fsdb\n";
    $out = new Fsdb::IO::Writer(-file => '-', -clone => $in)
        or die "cannot open stdout as fsdb\n";

    # core starts here
    my %hrow;
    while ($in->read_row_to_href(\%hrow)) {
        $hrow{product} = $hrow{x} * $hrow{y};
        $out->write_row_from_href(\%hrow);
    }

A hash is convenient because fields can be extracted directly by name
using hash keys, but hashes can be slow.

Arrays Instead of Hashes
We can add a bit to the end of the preamble:

    my $x_i = $in->col_to_i('x') // die "no x column.\n";
    my $y_i = $in->col_to_i('y') // die "no y column.\n";
    my $product_i = $in->col_to_i('product') // die "no product column.\n";

And then replace the core with arrays:

    my @arow;
    while ($in->read_row_to_aref(\@arow)) {
        $arow[$product_i] = $arow[$x_i] * $arow[$y_i];
        $out->write_row_from_aref(\@arow);
    }

This code has two advantages over hrefs: First, there is explicit error
checking for the presence of the expected fields. Second, arrays are
likely a bit faster than hashes.

Objects Instead of Arrays
Keeping the same preamble as for arrays, we can directly get internal
Fsdb "row objects" with a new core:

    # core
    my $rowobj;
    while ($rowobj = $in->read_rowobj) {
        if (!ref($rowobj)) {
            # a non-reference value is a comment; pass it to the comment handler
            &{$in->{_comment_sub}}($rowobj);
            next;
        }
        $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
        $out->write_rowobj($rowobj);
    }

This code is a bit faster because we just return the internal
representation (a rowobj), rather than copying it into an array.

Unfortunately, read_rowobj does not process comments itself, so the
loop must pass them to the comment handler by hand (the !ref($rowobj)
branch above).

Fastpathing
To go really fast, we can build a custom thunk (a chunk of code) that
does exactly what we want. This approach is called a "fastpath".

It requires a bit more in the preamble (building on the array version):

    my $in_fastpath_sub = $in->fastpath_sub();
    my $out_fastpath_sub = $out->fastpath_sub();

And it allows a shorter core (modeled on rowobjs), since the fastpath
includes comment processing:

    my $rowobj;
    while ($rowobj = &$in_fastpath_sub) {
        $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
        &$out_fastpath_sub($rowobj);
    }

This code is the fastest way to implement this block without evaling
code.

new
    $fsdb = new Fsdb::IO;

Creates a new IO object. Usually you should not create an Fsdb::IO
object directly, but instead create a "FsdbReader" or "FsdbWriter".

Options:

-fh FILE_HANDLE
    Write IO to the given file handle.
-header HEADER_LINE
    Force the header to the given HEADER_LINE (should be verbatim,
    including #h or whatever).
-fscode CODE
    Define just the column (or field) separator fscode part of the
    header. See dbfilealter for a list of valid field separators.
-rscode CODE
    Define just the row separator part of the header. See dbfilealter
    for a list of valid row separators.
-cols CODE
    Define just the columns of the header.
-compression CODE
    Define the compression mode for the file that will take effect
    after the header.
-clone $fsdb
    Copy the stream's configuration from $FSDB, another Fsdb::IO
    object.

_reset_cols
    $fsdb->_reset_cols

Internal: zero all the mappings in the current schema.

_find_filename_decompressor
Returns the name of the decompression program for FILE if it ends in a
compression extension.

config_one
    $fsdb->config_one($arglist_aref);

Parse the first configuration option on the list, removing it.

Options are listed in new.

config
    $fsdb->config(-arg1 => $value1, -arg2 => $value2);

Parse all options in the list.

default_binmode
    $fsdb->default_binmode();

Set the file to the correct binmode, either given by "-encoding" at
setup, or defaulting from "LC_CTYPE" or "LANG".

If the file is compressed, we will reset binmode after reading the
header.

compare
    $result = $fsdb->compare($other_fsdb)

Compares two Fsdb::IO objects, returning the string "identical" (same
field separator, columns, and column order), "compatible" (same field
separator but different columns), or undef if they differ.

close
    $fsdb->close;

Closes the file, frees the open file handle, or sends an EOF signal
(and undef) down the open queue.

error
    $fsdb->error;

Returns a descriptive string if there is an error, or undef if not.

The string will never end in a newline or punctuation.

update_v1_headerrow
Internal: create the header from the internal schema.

parse_v1_headerrow
Internal: interpret the header.

update_headerrow
Internal: create the header from the internal schema.

parse_headerrow
Internal: interpret the v2 header. Format is:

    #fsdb [-F x] [-R x] [-Z x] columns

All options must come first, start with dashes, and have an argument.
(More regular than the v1 header.)

Columns have optional :t type specifiers.
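The header layout above can be illustrated with a small standalone sketch. This is not the Fsdb::IO implementation: sketch_parse_headerrow is a hypothetical helper, and it assumes header tokens are separated by whitespace and that only the -F, -R, and -Z options shown in the format line occur. It does not validate the option arguments; see dbfilealter for the valid codes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Fsdb::IO): split a v2 header line into
# an option map and a list of column specifications.
sub sketch_parse_headerrow {
    my ($line) = @_;
    my @tokens = split ' ', $line;
    my $magic = shift @tokens;
    return unless defined $magic && $magic eq '#fsdb';

    my %opts;
    # Options come first, start with a dash, and each takes one argument.
    while (@tokens && $tokens[0] =~ /^-([FRZ])$/) {
        my $opt = $1;
        shift @tokens;
        return unless @tokens;    # option is missing its argument
        $opts{$opt} = shift @tokens;
    }
    # Everything left is a column specification, possibly with a :t suffix.
    return { opts => \%opts, colspecs => [@tokens] };
}

my $parsed = sketch_parse_headerrow('#fsdb -F t x y product:d');
print join(',', @{ $parsed->{colspecs} }), "\n";
```

Because every option takes exactly one argument, the loop can consume tokens in pairs until the first token that is not an option, which is what makes the v2 header more regular to parse.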

parse_v1_fscode
Internal.

parse_fscode
Parse the field separator. See dbfilealter for a list of valid values.

parse_rscode
Internal: Interpret rscodes.

See dbfilealter for a list of valid values.

parse_compression
Internal: Interpret compression.

See dbfilealter for a list of valid values.

establish_new_col_mapping
Internal.

col_create
    $fsdb->col_create($col_name)

Add a new column named $COL_NAME to the schema. Returns undef on
failure, or 1 if successful. (Note: it does not return the column index
on creation, so that "or" can be used for error checking, given that
the column index could be zero.) Also updates the header row to reflect
this column (compare to "_internal_col_create").

_internal_colspec_to_name_type
    ($name, $type, $type_speced) = $fsdb->_internal_colspec_to_name_type($colspec)

Split a colspec into a name, type, and the type as specified (which may
be null if no type was given).
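As a rough illustration of that three-way split, here is a standalone sketch. It is not the internal method: sketch_colspec_to_name_type is a hypothetical function, the "name:type" layout is inferred from the ":t type specifiers" mentioned for the header, and the default type 'a' is purely an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the internal method.
# ASSUMPTION: a colspec looks like "name" or "name:type", and an
# unspecified type falls back to a caller-supplied default ('a' here).
sub sketch_colspec_to_name_type {
    my ($colspec, $default_type) = @_;
    $default_type //= 'a';
    my ($name, $type_speced) = split /:/, $colspec, 2;
    my $type = (defined $type_speced && $type_speced ne '')
        ? $type_speced
        : $default_type;
    return ($name, $type, $type_speced);
}

my ($name, $type, $type_speced) = sketch_colspec_to_name_type('product:d');
print "$name has type $type\n";
```

Returning the type as specified separately from the effective type lets a caller tell "no type given" apart from "default type given explicitly".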

_internal_col_create
    $fsdb->_internal_col_create($colspec)

For internal "Fsdb::IO" use only. Create a new column $COL_NAME, just
like "col_create", but do not update the header row (as that function
does).

field_contains_fs
    $boolean = $fsdb->field_contains_fs($field);

Determine if the $FIELD contains $FSDB's fscode (in which case it is
malformed).

fref_contains_fs
    $boolean = $fsdb->fref_contains_fs($fref);

Determine if any field in $FREF contains $FSDB's fscode (in which case
it is malformed).

correct_fref_containing_fs
    $boolean = $fsdb->correct_fref_containing_fs($fref);

Patch up any field in $FREF that contains $FSDB's fscode, as best as
possible, by turning the field separator into underscores. Updates
$FREF in place, and returns whether it was altered. This function loses
data.
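The in-place patching can be pictured with a standalone sketch. It is not the Fsdb::IO code: sketch_correct_fref is a hypothetical function, and it takes the literal separator string as an argument (an assumption made so the example runs without an Fsdb object), whereas the real method uses the stream's own field separator.

```perl
use strict;
use warnings;

# Hypothetical sketch: replace every occurrence of the separator $fs in
# each field with an underscore, editing the array in place.
# Returns 1 if any field was changed, 0 otherwise.
sub sketch_correct_fref {
    my ($fref, $fs) = @_;
    my $altered = 0;
    for my $field (@$fref) {    # $field aliases the array element
        my $n = ($field =~ s/\Q$fs\E/_/g);
        $altered = 1 if $n;
    }
    return $altered;
}

my @row = ("new\tyork", "10");
my $changed = sketch_correct_fref(\@row, "\t");
print "row was altered\n" if $changed;
```

The rewrite is lossy because an underscore that replaced a separator cannot be told apart afterwards from one that was in the data originally.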

fscode
    $fscode = $fsdb->fscode;

Returns the fscode of the given database. (The encoded version
representing the field separator.) See also "fs" to get the actual
field separator.

fs
    $fs = $fsdb->fs;

Returns the field separator. See "fscode" to get the "encoded"
version.

rscode
    $rscode = $fsdb->rscode;

Returns the rscode of the given database.

ncols
    $ncols = $fsdb->ncols;

Returns the number of columns.

cols
    $fields_aref = $fsdb->cols;

Returns the column names (the field names, without type specifications)
of the open database as an aref.

colspecs
    $fields_aref = $fsdb->colspecs();

Returns the column headings (the field names) of the open database as
an aref.

col_to_i
    $i = $fsdb->col_to_i($column_name);

Returns the column index (0-based) of a given $COLUMN_NAME.

Note: tests for existence of columns must use "defined", since the
index can be 0, which would be interpreted as false.
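The pitfall in the note can be shown without Fsdb at all. In this sketch, sketch_col_to_i and the %index_of table are hypothetical stand-ins for a real lookup; only the behavior of truth tests versus "defined" on a 0 index is the point.

```perl
use strict;
use warnings;

# Hypothetical schema: column name to 0-based index.
my %index_of = (x => 0, y => 1, product => 2);
sub sketch_col_to_i { return $index_of{ $_[0] } }

my $x_i = sketch_col_to_i('x');

# A plain truth test wrongly treats column 0 as missing.
my $truthy_test_passes = $x_i ? 1 : 0;

# Testing definedness gives the right answer for column 0.
my $defined_test_passes = defined($x_i) ? 1 : 0;

# A column that does not exist yields undef.
my $missing = sketch_col_to_i('nope');

print "truth test: $truthy_test_passes, defined test: $defined_test_passes\n";
```

This is why the array example in the introduction uses the defined-or operator (//) rather than "or" when checking the result of col_to_i.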

col_to_name
    $name = $fsdb->col_to_name($column_name_or_index);

Returns the column name of a given $COLUMN_NAME_OR_INDEX.

col_to_type
    $type = $fsdb->col_to_type($column_name, $force_type);

Returns the column type (and undef if type is not required, unless
$FORCE_TYPE) of a given $COLUMN_NAME.

col_to_colspec
    $colspec = $fsdb->col_to_colspec($column_name, $force_type);

Returns the column specification (type is optional, unless $FORCE_TYPE)
of a given $COLUMN_NAME.

col_type_is_numeric
    $numeric = $fsdb->col_type_is_numeric($column_name);

Returns non-zero if column specification is numeric. (Actually,
returns 1 for integers and 2 for floats.)

i_to_col
    $name = $fsdb->i_to_col($column_index);

Returns the name of the COLUMN_INDEX-th (0-based) column.

fastpath_cancel
    $fsdb->fastpath_cancel();

Discard any active fastpath code and allow fastpath-incompatible
operations.

codify
    ($code, $has_last_refs) = $self->codify($underscored_pseudocode);

Convert db-code $UNDERSCORED_PSEUDOCODE into perl code in the context
of a given Fsdb stream.

We return a string of code $CODE that refers to "@{$fref}" and
"@{$lfref}" for the current and prior row arrays, and a flag
$HAS_LAST_REFS if "@{$lfref}" is needed. It is the caller's job to set
these up, probably by evaling the returned string in the context of
those variables.

The conversion is a rename of all _foo's into database fields. For
more perverse needs, _foo(N) means the Nth field after _foo. Also, as
of 29-Jan-00, _last_foo gives the last row's value (_last_foo(N) is not
supported). To convert we eval $codify_code.

20-Feb-07: _FROMFILE_foo opens the file called _foo and includes it in
place.

NEEDSWORK: Should make some attempt to catch misspellings of column
names.
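The renaming step can be approximated by a standalone sketch. It is not the codify implementation: sketch_codify is a hypothetical function, it takes the name-to-index map as an argument (an assumption, since the real method reads the stream's schema), and it covers only _foo and _last_foo, not _foo(N) or _FROMFILE_foo.

```perl
use strict;
use warnings;

# Hypothetical sketch: rewrite _foo into $fref->[i] and _last_foo into
# $lfref->[i], where i comes from a name-to-index map.
sub sketch_codify {
    my ($pseudocode, $index_of) = @_;
    my $has_last_refs = 0;

    my $rewrite = sub {
        my ($last, $col) = @_;
        # Unknown names are left untouched rather than guessed at.
        return '_' . ($last // '') . $col unless exists $index_of->{$col};
        if ($last) {
            $has_last_refs = 1;
            return '$lfref->[' . $index_of->{$col} . ']';
        }
        return '$fref->[' . $index_of->{$col} . ']';
    };

    (my $code = $pseudocode) =~ s/\b_(last_)?(\w+)/$rewrite->($1, $2)/ge;
    return ($code, $has_last_refs);
}

# The caller sets up $fref and evals the generated string.
my ($code, $has_last) = sketch_codify('_x * _y', { x => 0, y => 1 });
my $fref = [3, 4];
my $value = eval $code;
print "$code gives $value\n";
```

Leaving unknown names untouched means a misspelled column survives into the generated code and fails only when it is evaled, which is the gap the NEEDSWORK note describes.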

clean_potential_columns
    @clean = Fsdb::IO::clean_potential_columns(@dirty);

Clean up user-provided column names.


perl v5.34.1 2022-04-04 Fsdb::IO(3)