Fsdb::IO(3)           User Contributed Perl Documentation          Fsdb::IO(3)

NAME

       Fsdb::IO - base class for Fsdb IO (FsdbReader and FsdbWriter)

EXAMPLES

       There are several ways to do IO.  We look at several that compute the
       product of x and y for this input:

           #fsdb x y product
           1 10 -
           2 20 -

       The following routes go from easiest to use to hardest, and also from
       least efficient to most.  For IO-intensive work, if fastpath takes 1
       unit of time, then using hashes or arrays takes approximately 2 units
       of time, all due to CPU overhead.

   Using A Hash
           use Fsdb::IO::Reader;
           use Fsdb::IO::Writer;

           # preamble
           my $out;
           my $in = new Fsdb::IO::Reader(-file => '-', -comment_handler => \$out)
               or die "cannot open stdin as fsdb\n";
           $out = new Fsdb::IO::Writer(-file => '-', -clone => $in)
               or die "cannot open stdout as fsdb\n";

           # core starts here
           my %hrow;
           while ($in->read_row_to_href(\%hrow)) {
               $hrow{product} = $hrow{x} * $hrow{y};
               $out->write_row_from_href(\%hrow);
           };

       It can be convenient to use a hash because one can easily extract
       fields using hash keys, but hashes can be slow.

   Arrays Instead of Hashes
       We can add a bit to the end of the preamble:

           my $x_i = $in->col_to_i('x') // die "no x column.\n";
           my $y_i = $in->col_to_i('y') // die "no y column.\n";
           my $product_i = $in->col_to_i('product') // die "no product column.\n";

       And then replace the core with arrays:

           my @arow;
           while ($in->read_row_to_aref(\@arow)) {
               $arow[$product_i] = $arow[$x_i] * $arow[$y_i];
               $out->write_row_from_aref(\@arow);
           };

       This code has two advantages over hrefs: First, there is explicit error
       checking for presence of the expected fields.  Second, arrays are
       likely a bit faster than hashes.

   Objects Instead of Arrays
       Keeping the same preamble as for arrays, we can directly get internal
       Fsdb "row objects" with a new core:

           # core
           my $rowobj;
           while ($rowobj = $in->read_rowobj) {
               if (!ref($rowobj)) {
                   # comment
                   &{$in->{_comment_sub}}($rowobj);
                   next;
               };
               $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
               $out->write_rowobj($rowobj);
           };

       This code is a bit faster because we just return the internal
       representation (a rowobj), rather than copy into an array.

       However, unfortunately read_rowobj doesn't handle comment processing
       for us, so the loop has to pass comments along itself (the !ref branch
       above).

   Fastpathing
       To go really fast, we can build a custom thunk (a chunk of code) that
       does exactly what we want.  This approach is called a "fastpath".

       It requires a bit more in the preamble (building on the array version):

           my $in_fastpath_sub = $in->fastpath_sub();
           my $out_fastpath_sub = $out->fastpath_sub();

       And it allows a shorter core (modeled on rowobjs), since the fastpath
       includes comment processing:

           my $rowobj;
           while ($rowobj = &$in_fastpath_sub) {
               $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
               &$out_fastpath_sub($rowobj);
           };

       This code is the fastest way to implement this block without evaling
       code.

FUNCTIONS

   new
           $fsdb = new Fsdb::IO;

       Creates a new IO object.  Usually you should not create an Fsdb::IO
       object directly, but instead create an "Fsdb::IO::Reader" or
       "Fsdb::IO::Writer".

       Options:

       -fh FILE_HANDLE
           Write IO to the given file handle.

       -header HEADER_LINE
           Force the header to the given HEADER_LINE (should be verbatim,
           including #h or whatever).

       -fscode CODE
           Define just the column (or field) separator fscode part of the
           header.  See dbfilealter for a list of valid field separators.

       -rscode CODE
           Define just the row separator part of the header.  See dbfilealter
           for a list of valid row separators.

       -cols CODE
           Define just the columns of the header.

       -compression CODE
           Define the compression mode for the file that will take effect
           after the header.

       -clone $fsdb
           Copy the stream's configuration from $FSDB, another Fsdb::IO
           object.

   _reset_cols
           $fsdb->_reset_cols

       Internal: zero all the mappings in the current schema.

   _find_filename_decompressor
       Returns the name of the decompression program for FILE if it ends in a
       compression extension.

   config_one
           $fsdb->config_one($arglist_aref);

       Parse the first configuration option on the list, removing it.

       Options are listed in new.

   config
           $fsdb->config(-arg1 => $value1, -arg2 => $value2);

       Parse all options in the list.

   default_binmode
           $fsdb->default_binmode();

       Set the file to the correct binmode, either given by "-encoding" at
       setup, or defaulting from "LC_CTYPE" or "LANG".

       If the file is compressed, we will reset binmode after reading the
       header.

   compare
           $result = $fsdb->compare($other_fsdb)

       Compares two Fsdb::IO objects, returning the string "identical" (same
       field separator, columns, and column order), "compatible" (same field
       separator but different columns), or undef if they differ.

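       The three outcomes can be sketched in plain Perl.  This is only an
       illustration built from the description above, not the module's code:
       sketch_compare and the hash layout (fs, cols) are hypothetical, and
       the real method may consider more of the stream's configuration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for compare(): each "stream" is a plain hash with
# a field separator (fs) and an ordered list of column names (cols).
sub sketch_compare {
    my ($left, $right) = @_;
    # Different field separators: the streams differ.
    return undef unless $left->{fs} eq $right->{fs};
    # Same separator, same columns in the same order.
    return 'identical'
        if join("\0", @{ $left->{cols} }) eq join("\0", @{ $right->{cols} });
    # Same separator, but the columns are not the same.
    return 'compatible';
}

my $s1 = { fs => "\t", cols => [qw(x y product)] };
my $s2 = { fs => "\t", cols => [qw(x y product)] };
my $s3 = { fs => "\t", cols => [qw(y x)] };
my $s4 = { fs => " ",  cols => [qw(x y product)] };

my @results = map { sketch_compare($s1, $_) // 'undef' } ($s2, $s3, $s4);
print "@results\n";
```
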
   close
           $fsdb->close;

       Closes the file, frees open file handle, or sends an EOF signal (and
       undef) down the open queue.

   error
           $fsdb->error;

       Returns a descriptive string if there is an error, or undef if not.

       The string will never end in a newline or punctuation.

   update_v1_headerrow
       Internal: create the header from the internal schema.

   parse_v1_headerrow
       Internal: interpret the header.

   update_headerrow
       Internal: create the header from the internal schema.

   parse_headerrow
       Internal: interpret the v2 header.  Format is:

           #fsdb [-F x] [-R x] [-Z x] columns

       All options must come first, start with dashes, and have an argument.
       (More regular than the v1 header.)

       Columns have optional :t type specifiers.

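       To make the v2 header layout concrete, here is a minimal sketch of a
       parser for it in plain Perl.  It is not the module's parse_headerrow:
       the name sketch_parse_header, the returned structure, and the sample
       type letter "d" are assumptions made for illustration, and no option
       values are validated.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; this is NOT Fsdb's parse_headerrow.
sub sketch_parse_header {
    my ($line) = @_;
    my @tokens = split ' ', $line;
    my $magic = shift @tokens;
    return unless defined $magic && $magic eq '#fsdb';

    # Options come first, start with a dash, and each takes one argument.
    my %opts;
    while (@tokens && $tokens[0] =~ /^-(\w)$/) {
        my $flag = $1;
        shift @tokens;
        return unless @tokens;          # option without its argument
        $opts{$flag} = shift @tokens;
    }

    # Whatever remains is a colspec: a name with an optional :type suffix.
    my @cols;
    for my $colspec (@tokens) {
        my ($name, $type) = split /:/, $colspec, 2;
        push @cols, { name => $name, type => $type };
    }
    return { opts => \%opts, cols => \@cols };
}

my $parsed = sketch_parse_header('#fsdb -F t x y:d product');
printf "F=%s cols=%s\n", $parsed->{opts}{F},
    join(',', map { $_->{name} } @{ $parsed->{cols} });
```

       Because every option takes exactly one argument, the loop can stop at
       the first token that does not start with a dash and treat the rest as
       columns.
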
   parse_v1_fscode
       Internal.

   parse_fscode
       Parse the field separator.  See dbfilealter for a list of valid values.

   parse_rscode
       Internal: Interpret rscodes.

       See dbfilealter for a list of valid values.

   parse_compression
       Internal: Interpret compression.

       See dbfilealter for a list of valid values.

   establish_new_col_mapping
       Internal.

   col_create
           $fsdb->col_create($col_name)

       Add a new column named $COL_NAME to the schema.  Returns undef on
       failure, or 1 if successful.  (Note: does not return the column index
       on creation so that "or" can be used for error checking, given that
       the column number could be zero.)  Also, update the header row to
       reflect this column (compare to "_internal_col_create").

   colspec_to_name_type_spec
           ($name, $type, $type_speced) = $fsdb->colspec_to_name_type_spec($colspec)

       Split a colspec into a name, type, and the type as specified (which may
       be null if no type was given).

   _internal_col_create
           $fsdb->_internal_col_create($colspec)

       For internal "Fsdb::IO" use only.  Create a new column $COL_NAME, just
       like "col_create", but do not update the header row (as that function
       does).

   field_contains_fs
           $boolean = $fsdb->field_contains_fs($field);

       Determine if the $FIELD contains $FSDB's fscode (in which case it is
       malformed).

   fref_contains_fs
           $boolean = $fsdb->fref_contains_fs($fref);

       Determine if any field in $FREF contains $FSDB's fscode (in which case
       it is malformed).

   correct_fref_containing_fs
           $boolean = $fsdb->correct_fref_containing_fs($fref);

       Patch up any field in $FREF that contains $FSDB's fscode, as best as
       possible, by turning the field separator into underscores.  Updates
       $FREF in place, and returns whether it was altered.  This function
       loses data.

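       The patch-up can be pictured with a small plain-Perl sketch.  It is a
       hypothetical stand-in (sketch_correct_fields is not part of Fsdb) and
       assumes the field separator is a literal string such as a tab.

```perl
use strict;
use warnings;

# Hypothetical stand-in, not Fsdb's correct_fref_containing_fs.
# $fs is assumed to be a literal separator string.
sub sketch_correct_fields {
    my ($fref, $fs) = @_;
    my $altered = 0;
    for my $field (@$fref) {
        # s///g returns the substitution count, which is true if any occurred.
        $altered = 1 if $field =~ s/\Q$fs\E/_/g;
    }
    return $altered;
}

my @row = ("plain", "has\ttab");
my $changed = sketch_correct_fields(\@row, "\t");
print "changed=$changed row=@row\n";
```

       The original separator characters cannot be recovered afterwards,
       which is why the operation is lossy.
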
   fscode
           $fscode = $fsdb->fscode;

       Returns the fscode of the given database.  (The encoded version
       representing the field separator.)  See also fs to get the actual field
       separator.

   fs
           $fs = $fsdb->fs;

       Returns the field separator.  See "fscode" to get the "encoded"
       version.

   rscode
           $rscode = $fsdb->rscode;

       Returns the rscode of the given database.

   ncols
           $ncols = $fsdb->ncols;

       Return the number of columns.

   cols
           $fields_aref = $fsdb->cols;

       Returns the column names (the field names, without type specifications)
       of the open database as an aref.

   colspecs
           $fields_aref = $fsdb->colspecs();

       Returns the column headings (the field names) of the open database as
       an aref.

   col_to_i
           $coli = $fsdb->col_to_i($column_name);

       Returns the column index (0-based) of a given $COLUMN_NAME.  (Names
       cannot have types with them.)

       Note: tests for existence of columns must use "defined", since the
       index can be 0 which would be interpreted as false.

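       The pitfall in that note is easy to reproduce without Fsdb.  In the
       sketch below, sketch_col_to_i and %index_of are hypothetical stand-ins
       for a lookup that returns a 0-based index or undef.

```perl
use strict;
use warnings;

# Stand-in for $fsdb->col_to_i: a plain hash of column name to 0-based index.
my %index_of = (x => 0, y => 1);
sub sketch_col_to_i { return $index_of{ $_[0] } }

# Truthiness wrongly reports the first column as missing, because 0 is false.
my $x_by_truth   = sketch_col_to_i('x') ? 'present' : 'missing';
# defined (or the // operator) tells index 0 apart from "no such column".
my $x_by_defined = defined(sketch_col_to_i('x')) ? 'present' : 'missing';
my $z_by_defined = defined(sketch_col_to_i('z')) ? 'present' : 'missing';
my $y_i = sketch_col_to_i('y') // die "no y column.\n";

print "x by truth: $x_by_truth, x by defined: $x_by_defined, z: $z_by_defined\n";
```

       This is why the array example earlier uses "//" rather than "or" or
       "||" when checking the result of col_to_i.
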
   colspec_to_i
           $coli = $fsdb->colspec_to_i($column_specification);

       Returns the column index (0-based) of a given $COLUMN_SPECIFICATION.
       The specification may or may not include a type.

       Note: tests for existence of columns must use "defined", since the
       index can be 0 which would be interpreted as false.

   col_to_name
           $name = $fsdb->col_to_name($column_name_or_index);

       Returns the column name of a given $COLUMN_NAME_OR_INDEX.

   col_to_type
           $type = $fsdb->col_to_type($column_name, $force_type);

       Returns the column type (and undef if type is not required, unless
       $FORCE_TYPE) of a given $COLUMN_NAME.

   col_to_colspec
           $colspec = $fsdb->col_to_colspec($column_name, $force_type);

       Returns the column specification (type is optional, unless $FORCE_TYPE)
       of a given $COLUMN_NAME.

   col_type_is_numeric
           $is_numeric = $fsdb->col_type_is_numeric($column_name);

       Returns non-zero if column specification is numeric.  (Actually,
       returns 1 for integers and 2 for floats.)

   i_to_col
           $name = $fsdb->i_to_col($column_index);

       Return the name of the COLUMN_INDEX-th (0-based) column.

   fastpath_cancel
           $fsdb->fastpath_cancel();

       Discard any active fastpath code and allow fastpath-incompatible
       operations.

   codify
           ($code, $has_last_refs) = $self->codify($underscored_pseudocode);

       Convert db-code $UNDERSCORED_PSEUDOCODE into perl code in the context
       of a given Fsdb stream.

       We return a string of code $CODE that refs "@{$fref}" and "@{$lfref}"
       for the current and prior row arrays, and a flag $HAS_LAST_REFS if
       "@{$lfref}" is needed.  It is the caller's job to set these up,
       probably by evaling the returned string in the context of those
       variables.

       The conversion is a rename of all _foo's into database fields.  For
       more perverse needs, _foo(N) means the Nth field after _foo.  Also, as
       of 29-Jan-00, _last_foo gives the last row's value (_last_foo(N) is not
       supported).  To convert we eval $codify_code.

       20-Feb-07: _FROMFILE_foo opens the file called _foo and includes it in
       place.

       NEEDSWORK:  Should make some attempt to catch misspellings of column
       names.

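       As a rough picture of the basic rename, the following plain-Perl
       sketch rewrites _foo tokens into accesses on $fref and then evals the
       result.  It is a hypothetical simplification, not the module's
       codify: sketch_codify and %col_to_i are invented for illustration,
       and _foo(N), _last_foo, and _FROMFILE_foo are not handled.

```perl
use strict;
use warnings;

# Hypothetical simplification, not Fsdb's codify: handles plain _foo only.
sub sketch_codify {
    my ($pseudocode, $col_to_i) = @_;
    my $code = $pseudocode;
    # Rewrite each _name that matches a known column into $fref->[index];
    # unknown names are left untouched.
    $code =~ s{\b_(\w+)}{
        exists $col_to_i->{$1} ? sprintf('$fref->[%d]', $col_to_i->{$1}) : "_$1"
    }ge;
    return $code;
}

my %col_to_i = (x => 0, y => 1, product => 2);
my $code = sketch_codify('_product = _x * _y;', \%col_to_i);

# The caller sets up $fref and evals the generated string in its scope.
my $fref = [3, 4, '-'];
eval $code;
die $@ if $@;
print "$code => $fref->[2]\n";
```

       Leaving unknown names untouched means a misspelled column only
       surfaces later, at eval time, which is the gap the NEEDSWORK note
       above describes.
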
   clean_potential_columns
           @clean = Fsdb::IO::clean_potential_columns(@dirty);

       Clean up user-provided column names.

perl v5.38.0                      2023-07-20                       Fsdb::IO(3)