Fsdb::IO(3)           User Contributed Perl Documentation          Fsdb::IO(3)

NAME

       Fsdb::IO - base class for Fsdb IO (FsdbReader and FsdbWriter)

EXAMPLES

       There are several ways to do IO.  We look at several that compute the
       product of x and y for this input:

           #fsdb x y product
           1 10 -
           2 20 -

       The following approaches go from easiest to use to hardest, and also
       from least efficient to most efficient.  For IO-intensive work, if
       fastpath takes 1 unit of time, then using hashes or arrays takes
       approximately 2 units of time, all due to CPU overhead.

   Using A Hash
           use Fsdb::IO::Reader;
           use Fsdb::IO::Writer;

           # preamble
           # $out is declared first so -comment_handler can refer to it
           my $out;
           my $in = new Fsdb::IO::Reader(-file => '-', -comment_handler => \$out)
               or die "cannot open stdin as fsdb\n";
           $out = new Fsdb::IO::Writer(-file => '-', -clone => $in)
               or die "cannot open stdout as fsdb\n";

           # core starts here
           my %hrow;
           while ($in->read_row_to_href(\%hrow)) {
               $hrow{product} = $hrow{x} * $hrow{y};
               $out->write_row_from_href(\%hrow);
           };

       It can be convenient to use a hash because one can easily extract
       fields using hash keys, but hashes can be slow.

   Arrays Instead of Hashes
       We can add a bit to the end of the preamble:

           my $x_i = $in->col_to_i('x') // die "no x column.\n";
           my $y_i = $in->col_to_i('y') // die "no y column.\n";
           my $product_i = $in->col_to_i('product') // die "no product column.\n";

       And then replace the core with arrays:

           my @arow;
           while ($in->read_row_to_aref(\@arow)) {
               $arow[$product_i] = $arow[$x_i] * $arow[$y_i];
               $out->write_row_from_aref(\@arow);
           };

       This code has two advantages over hrefs: First, there is explicit error
       checking for the presence of the expected fields.  Second, arrays are
       likely a bit faster than hashes.

   Objects Instead of Arrays
       Keeping the same preamble as for arrays, we can directly get internal
       Fsdb "row objects" with a new core:

           # core
           my $rowobj;
           while ($rowobj = $in->read_rowobj) {
               if (!ref($rowobj)) {
                   # comment
                   &{$in->{_comment_sub}}($rowobj);
                   next;
               };
               $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
               $out->write_rowobj($rowobj);
           };

       This code is a bit faster because we just return the internal
       representation (a rowobj), rather than copy into an array.

       However, read_rowobj does not process comments for us, so the loop
       must detect them (a non-reference return value) and pass them to the
       comment handler itself.

   Fastpathing
       To go really fast, we can build a custom thunk (a chunk of code) that
       does exactly what we want.  This approach is called a "fastpath".

       It requires a bit more in the preamble (building on the array version):

           my $in_fastpath_sub = $in->fastpath_sub();
           my $out_fastpath_sub = $out->fastpath_sub();

       And it allows a shorter core (modeled on rowobjs), since the fastpath
       includes comment processing:

           my $rowobj;
           while ($rowobj = &$in_fastpath_sub) {
               $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
               &$out_fastpath_sub($rowobj);
           };

       This code is the fastest way to implement this block without evaling
       code.

FUNCTIONS

   new
           $fsdb = new Fsdb::IO;

       Creates a new IO object.  Usually you should not create an Fsdb::IO
       object directly, but instead create a "FsdbReader" or "FsdbWriter".

       Options:

       -fh FILE_HANDLE
           Write IO to the given file handle.

       -header HEADER_LINE
           Force the header to the given HEADER_LINE (should be verbatim,
           including #h or whatever).

       -fscode CODE
           Define just the column (or field) separator fscode part of the
           header.  See dbfilealter for a list of valid field separators.

       -rscode CODE
           Define just the row separator part of the header.  See dbfilealter
           for a list of valid row separators.

       -cols CODE
           Define just the columns of the header.

       -compression CODE
           Define the compression mode for the file that will take effect
           after the header.

       -clone $fsdb
           Copy the stream's configuration from $FSDB, another Fsdb::IO
           object.

   _reset_cols
           $fsdb->_reset_cols

       Internal: zero all the mappings in the current schema.

   _find_filename_decompressor
       Returns the name of the decompression program for FILE if it ends in a
       compression extension.

   config_one
           $fsdb->config_one($arglist_aref);

       Parse the first configuration option on the list, removing it.

       Options are listed in new.

   config
           $fsdb->config(-arg1 => $value1, -arg2 => $value2);

       Parse all options in the list.

   default_binmode
           $fsdb->default_binmode();

       Set the file to the correct binmode, either given by "-encoding" at
       setup, or defaulting from "LC_CTYPE" or "LANG".

       If the file is compressed, we will reset binmode after reading the
       header.

   compare
           $result = $fsdb->compare($other_fsdb)

       Compares two Fsdb::IO objects, returning the string "identical" (same
       field separator, columns, and column order), "compatible" (same field
       separator but different columns), or undef if they differ.

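       The three outcomes above can be sketched in plain Perl.  This is an
       illustration only, not the library's implementation: compare_sketch
       and the hash layout (fscode, cols) are hypothetical, and the fscode
       values are sample strings.

```perl
use strict;
use warnings;

# Hypothetical schema records: a field-separator code plus an ordered
# column list.  Not the real Fsdb::IO object layout.
sub compare_sketch {
    my ($left, $right) = @_;
    # Different field separators: the streams differ.
    return undef if $left->{fscode} ne $right->{fscode};
    # Same separator: compare column names and their order.
    my $same_cols =
        join("\0", @{$left->{cols}}) eq join("\0", @{$right->{cols}});
    return $same_cols ? 'identical' : 'compatible';
}

my $s1 = { fscode => 'D', cols => [qw(x y product)] };
my $s2 = { fscode => 'D', cols => [qw(x y)] };
my $s3 = { fscode => 'S', cols => [qw(x y product)] };

print compare_sketch($s1, $s1), "\n";    # identical
print compare_sketch($s1, $s2), "\n";    # compatible
print defined(compare_sketch($s1, $s3)) ? "defined" : "undef", "\n";    # undef
```

       Because one outcome is undef, callers should test the result with
       "defined" before comparing it to a string.
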
   close
           $fsdb->close;

       Closes the file, frees the open file handle, or sends an EOF signal
       (and undef) down the open queue.

   error
           $fsdb->error;

       Returns a descriptive string if there is an error, or undef if not.

       The string will never end in a newline or punctuation.

   update_v1_headerrow
       Internal: create the header from the internal schema.

   parse_v1_headerrow
       Internal: interpret the v1 header.

   update_headerrow
       Internal: create the header from the internal schema.

   parse_headerrow
       Internal: interpret the v2 header.  Format is:

           #fsdb [-F x] [-R x] [-Z x] columns

       All options must come first, start with dashes, and have an argument.
       (More regular than the v1 header.)

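       As a rough illustration of the v2 header layout described above, the
       following sketch splits a header line into options and columns.
       parse_header_sketch is a hypothetical helper, not part of Fsdb::IO,
       and it does not validate the option arguments.

```perl
use strict;
use warnings;

# Hypothetical helper, not the library's parser: split a v2 header line
# into its options and its column names.
sub parse_header_sketch {
    my ($line) = @_;
    my @words = split ' ', $line;
    my $magic = shift @words;
    return undef unless defined $magic && $magic eq '#fsdb';

    my %opts;
    # Options come first; each starts with a dash and takes one argument.
    while (@words && $words[0] =~ /^-([FRZ])$/) {
        my $opt = $1;
        shift @words;
        return undef unless @words;    # option without its argument
        $opts{$opt} = shift @words;
    }
    # Everything left over is a column name.
    return { opts => \%opts, cols => [@words] };
}

my $hdr = parse_header_sketch('#fsdb -F t x y product');
print "F=$hdr->{opts}{F} cols=@{$hdr->{cols}}\n";    # F=t cols=x y product
```

       The sketch stops reading options at the first word that is not -F,
       -R, or -Z, which mirrors the rule that all options must come first.
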
   parse_v1_fscode
       Internal.

   parse_fscode
       Parse the field separator.  See dbfilealter for a list of valid values.

   parse_rscode
       Internal: Interpret rscodes.

       See dbfilealter for a list of valid values.

   parse_compression
       Internal: Interpret compression.

       See dbfilealter for a list of valid values.

   establish_new_col_mapping
       Internal.

   col_create
           $fsdb->col_create($col_name)

       Add a new column named $COL_NAME to the schema.  Returns undef on
       failure, or 1 if successful.  (Note: does not return the column index
       on creation, so that "or" can be used for error checking, given that
       the column number could be zero.)  Also updates the header row to
       reflect this column (compare to "_internal_col_create").

   _internal_col_create
           $fsdb->_internal_col_create($col_name)

       For internal "Fsdb::IO" use only.  Create a new column $COL_NAME, just
       like "col_create", but do not update the header row (as that function
       does).

   field_contains_fs
           $boolean = $fsdb->field_contains_fs($field);

       Determine if the $FIELD contains $FSDB's fscode (in which case it is
       malformed).

   fref_contains_fs
           $boolean = $fsdb->fref_contains_fs($fref);

       Determine if any field in $FREF contains $FSDB's fscode (in which case
       it is malformed).

   correct_fref_containing_fs
           $boolean = $fsdb->correct_fref_containing_fs($fref);

       Patch up any field in $FREF that contains $FSDB's fscode, as best as
       possible, by turning the field separator into underscores.  Updates
       $FREF in place, and returns whether it was altered.  This function
       loses data.

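       The patch-up described above can be pictured with a small sketch.  It
       is not the library's code: correct_fields_sketch is a hypothetical
       helper, and it assumes the field separator is a literal string (a tab
       here) rather than whatever form the real fscode takes.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every occurrence of the separator with an
# underscore, editing the array in place.
sub correct_fields_sketch {
    my ($fref, $fs) = @_;
    my $altered = 0;
    for my $field (@$fref) {
        # s///g returns the number of substitutions made, which is false
        # when nothing matched.
        $altered = 1 if $field =~ s/\Q$fs\E/_/g;
    }
    return $altered;
}

my @row = ("a\tb", "c");
my $changed = correct_fields_sketch(\@row, "\t");
print "changed=$changed row=", join('|', @row), "\n";    # changed=1 row=a_b|c
```

       The original separator characters cannot be recovered afterwards,
       which is why the operation loses data.
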
   fscode
           $fscode = $fsdb->fscode;

       Returns the fscode of the given database.  (The encoded version
       representing the field separator.)  See also fs to get the actual field
       separator.

   fs
           $fs = $fsdb->fs;

       Returns the field separator.  See "fscode" to get the "encoded"
       version.

   rscode
           $rscode = $fsdb->rscode;

       Returns the rscode of the given database.

   ncols
           $ncols = $fsdb->ncols;

       Return the number of columns.

   cols
           $fields_aref = $fsdb->cols;

       Returns the column headings (the field names) of the open database as
       an aref.

   col_to_i
           $column_index = $fsdb->col_to_i($column_name);

       Returns the column index (0-based) of a given $COLUMN_NAME.

       Note: tests for existence of columns must use "defined", since the
       index can be 0 which would be interpreted as false.

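       The pitfall in the note above is easy to reproduce without Fsdb at
       all.  In this sketch col_to_i_sketch and %col_index are hypothetical
       stand-ins for the real lookup; only the truth-versus-defined behavior
       is the point.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a schema lookup: maps column names to 0-based
# indexes and yields undef for unknown names.
my %col_index = (x => 0, y => 1, product => 2);
sub col_to_i_sketch {
    my ($name) = @_;
    return $col_index{$name};
}

# Wrong: column x exists, but its index 0 is false in boolean context.
my $x_found_by_truth = col_to_i_sketch('x') ? 1 : 0;

# Right: test definedness, either with defined() or the // operator.
my $x_found_by_defined = defined(col_to_i_sketch('x')) ? 1 : 0;
my $z_i = col_to_i_sketch('z') // -1;    # -1 marks "missing" in this sketch only

# prints: truth test: 0, defined test: 1, z: -1
print "truth test: $x_found_by_truth, defined test: $x_found_by_defined, z: $z_i\n";
```

       This is the same reason the array example in EXAMPLES uses "//"
       rather than "||" before "die".
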
   i_to_col
           $column_name = $fsdb->i_to_col($column_index);

       Return the name of the COLUMN_INDEX-th (0-based) column.

   fastpath_cancel
           $fsdb->fastpath_cancel();

       Discard any active fastpath code and allow fastpath-incompatible
       operations.

   codify
           ($code, $has_last_refs) = $self->codify($underscored_pseudocode);

       Convert db-code $UNDERSCORED_PSEUDOCODE into perl code in the context
       of a given Fsdb stream.

       We return a string of code $CODE that refs "@{$fref}" and "@{$lfref}"
       for the current and prior row arrays, and a flag $HAS_LAST_REFS if
       "@{$lfref}" is needed.  It is the caller's job to set these up,
       probably by evaling the returned string in the context of those
       variables.

       The conversion is a rename of all _foo's into database fields.  For
       more perverse needs, _foo(N) means the Nth field after _foo.  Also, as
       of 29-Jan-00, _last_foo gives the last row's value (_last_foo(N) is not
       supported).  To convert we eval $codify_code.

       20-Feb-07: _FROMFILE_foo opens the file called _foo and includes it in
       place.

       NEEDSWORK:  Should make some attempt to catch misspellings of column
       names.

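       To make the rename step concrete, here is a deliberately simplified
       sketch.  It is not the library's codify: codify_sketch and %col_index
       are hypothetical, and it handles only plain _foo names (no _foo(N),
       _last_foo, or _FROMFILE_foo).

```perl
use strict;
use warnings;

# Hypothetical column map; a real stream would supply its own schema.
my %col_index = (x => 0, y => 1, product => 2);

# Rewrite each _name token that matches a known column into an access on
# $fref; unknown names are left untouched.
sub codify_sketch {
    my ($pseudocode) = @_;
    (my $code = $pseudocode) =~
        s/(?<!\w)_([A-Za-z]\w*)/exists $col_index{$1} ? "\$fref->[$col_index{$1}]" : "_$1"/ge;
    return $code;
}

my $code = codify_sketch('_x * _y');
my $fref = [3, 4, '-'];       # the caller supplies the current row
my $product = eval $code;     # evaluated where $fref is in scope
die $@ if $@;
print "$code => $product\n";  # $fref->[0] * $fref->[1] => 12
```

       As in the description above, the generated string only becomes
       useful once it is evaled in a scope where $fref is defined.
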
   clean_potential_columns
           @clean = Fsdb::IO::clean_potential_columns(@dirty);

       Clean up user-provided column names.



perl v5.34.0                      2021-07-22                       Fsdb::IO(3)