Fsdb::IO(3)         User Contributed Perl Documentation         Fsdb::IO(3)

Fsdb::IO - base class for Fsdb IO (FsdbReader and FsdbWriter)

There are several ways to do IO. We look at several that compute the
product of x and y for this input:

    #fsdb x y product
    1 10 -
    2 20 -

The following routes go from easiest to use to hardest, and also from
least efficient to most. For IO-intensive work, if fastpath takes 1
unit of time, then using hashes or arrays takes approximately 2 units
of time, all due to CPU overhead.

Using A Hash
    use Fsdb::IO::Reader;
    use Fsdb::IO::Writer;

    # preamble
    my $out;
    my $in = new Fsdb::IO::Reader(-file => '-', -comment_handler => \$out)
        or die "cannot open stdin as fsdb\n";
    $out = new Fsdb::IO::Writer(-file => '-', -clone => $in)
        or die "cannot open stdout as fsdb\n";

    # core starts here
    my %hrow;
    while ($in->read_row_to_href(\%hrow)) {
        $hrow{product} = $hrow{x} * $hrow{y};
        $out->write_row_from_href(\%hrow);
    };

It can be convenient to use a hash because one can easily extract
fields using hash keys, but hashes can be slow.

Arrays Instead of Hashes
We can add a bit to the end of the preamble:

    my $x_i = $in->col_to_i('x') // die "no x column.\n";
    my $y_i = $in->col_to_i('y') // die "no y column.\n";
    my $product_i = $in->col_to_i('product') // die "no product column.\n";

And then replace the core with arrays:

    my @arow;
    while ($in->read_row_to_aref(\@arow)) {
        $arow[$product_i] = $arow[$x_i] * $arow[$y_i];
        $out->write_row_from_aref(\@arow);
    };

This code has two advantages over hrefs: First, there is explicit error
checking for the presence of the expected fields. Second, arrays are
likely a bit faster than hashes.

Objects Instead of Arrays
Keeping the same preamble as for arrays, we can directly get internal
Fsdb "row objects" with a new core:

    # core
    my $rowobj;
    while ($rowobj = $in->read_rowobj) {
        if (!ref($rowobj)) {
            # comment
            &{$in->{_comment_sub}}($rowobj);
            next;
        };
        $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
        $out->write_rowobj($rowobj);
    };

This code is a bit faster because we just return the internal
representation (a rowobj), rather than copy into an array.

However, read_rowobj does not process comments itself, so the core
must pass them to the comment handler explicitly, as shown.

Fastpathing
To go really fast, we can build a custom thunk (a chunk of code) that
does exactly what we want. This approach is called a "fastpath".

It requires a bit more in the preamble (building on the array version):

    my $in_fastpath_sub = $in->fastpath_sub();
    my $out_fastpath_sub = $out->fastpath_sub();

And it allows a shorter core (modeled on rowobjs), since the fastpath
includes comment processing:

    my $rowobj;
    while ($rowobj = &$in_fastpath_sub) {
        $rowobj->[$product_i] = $rowobj->[$x_i] * $rowobj->[$y_i];
        &$out_fastpath_sub($rowobj);
    };

This code is the fastest way to implement this block without evaling
code.

new
    $fsdb = new Fsdb::IO;

Creates a new IO object. Usually you should not create a FsdbIO object
directly, but instead create a "FsdbReader" or "FsdbWriter".

Options:

-fh FILE_HANDLE
    Write IO to the given file handle.
-header HEADER_LINE
    Force the header to the given HEADER_LINE (should be verbatim,
    including #h or whatever).
-fscode CODE
    Define just the column (or field) separator fscode part of the
    header. See dbfilealter for a list of valid field separators.
-rscode CODE
    Define just the row separator part of the header. See dbfilealter
    for a list of valid row separators.
-cols CODE
    Define just the columns of the header.
-compression CODE
    Define the compression mode for the file that will take effect
    after the header.
-clone $fsdb
    Copy the stream's configuration from $FSDB, another Fsdb::IO
    object.

_reset_cols
    $fsdb->_reset_cols

Internal: zero all the mappings in the current schema.

_find_filename_decompressor
Returns the name of the decompression program for FILE if it ends in a
compression extension.

config_one
    $fsdb->config_one($arglist_aref);

Parse the first configuration option on the list, removing it.

Options are listed in new.

config
    $fsdb->config(-arg1 => $value1, -arg2 => $value2);

Parse all options in the list.

default_binmode
    $fsdb->default_binmode();

Set the file to the correct binmode, either given by "-encoding" at
setup, or defaulting from "LC_CTYPE" or "LANG".

If the file is compressed, we will reset binmode after reading the
header.

compare
    $result = $fsdb->compare($other_fsdb)

Compares two Fsdb::IO objects, returning the string "identical" (same
field separator, columns, and column order), "compatible" (same field
separator but different columns), or undef if they differ.
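The three outcomes above can be sketched in plain Perl. This is only an illustration of the rule as described: compare_schemas, the hash layout, and the separator codes used are assumptions made for the example, not Fsdb's internals.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a stream's schema: an fscode plus an ordered
# list of column names. This is not Fsdb's internal representation.
sub compare_schemas {
    my ($a_schema, $b_schema) = @_;
    # Different field separators: the streams differ.
    return undef if $a_schema->{fscode} ne $b_schema->{fscode};
    my @a_cols = @{ $a_schema->{cols} };
    my @b_cols = @{ $b_schema->{cols} };
    # Same separator, same columns in the same order.
    return 'identical'
        if @a_cols == @b_cols
        && !grep { $a_cols[$_] ne $b_cols[$_] } 0 .. $#a_cols;
    # Same separator, but the columns or their order differ.
    return 'compatible';
}

# Illustrative values only; see dbfilealter for the real fscodes.
my $xy     = { fscode => 'D', cols => [qw(x y product)] };
my $yx     = { fscode => 'D', cols => [qw(y x)] };
my $spaces = { fscode => 'S', cols => [qw(x y product)] };

for my $other ($xy, $yx, $spaces) {
    my $result = compare_schemas($xy, $other) // 'undef';
    print "$result\n";
}
```

Because the "differ" case is undef rather than a string, callers should test the result with defined before comparing it with eq.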

close
    $fsdb->close;

Closes the file, frees open file handle, or sends an EOF signal (and
undef) down the open queue.

error
    $fsdb->error;

Returns a descriptive string if there is an error, or undef if not.

The string will never end in a newline or punctuation.

update_v1_headerrow
Internal: create the header from the internal schema.

parse_v1_headerrow
Internal: interpret the header.

update_headerrow
Internal: create the header from the internal schema.

parse_headerrow
Internal: interpret the v2 header. Format is:

    #fsdb [-F x] [-R x] [-Z x] columns

All options must come first, start with dashes, and have an argument.
(More regular than the v1 header.)
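To make the format concrete, here is a minimal sketch of a parser for such a line, assuming whitespace-separated words. The name parse_v2_header_sketch and its return structure are hypothetical; it illustrates the layout only and is not the parse_headerrow implementation.

```perl
use strict;
use warnings;

# Hypothetical parser for a v2 header line of the form
#   #fsdb [-F x] [-R x] [-Z x] columns
# A sketch of the format only, not Fsdb's parse_headerrow.
sub parse_v2_header_sketch {
    my ($line) = @_;
    my @words = split ' ', $line;
    my $magic = shift @words;
    return undef unless defined $magic && $magic eq '#fsdb';
    my %opts;
    # Options come first; each starts with a dash and takes one argument.
    while (@words && $words[0] =~ /^-(\w)$/) {
        my $flag = $1;
        shift @words;
        return undef unless @words;    # option without its argument
        $opts{$flag} = shift @words;
    }
    # Whatever remains is the list of column names.
    return { opts => \%opts, cols => [@words] };
}

my $h = parse_v2_header_sketch('#fsdb -F t x y product');
print "fscode: $h->{opts}{F}\n";
print "cols:   @{ $h->{cols} }\n";
```

Since every option takes exactly one argument, the loop can consume words in pairs without any per-option special cases, which is the regularity the text refers to.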

parse_v1_fscode
Internal.

parse_fscode
Parse the field separator. See dbfilealter for a list of valid values.

parse_rscode
Internal: Interpret rscodes.

See dbfilealter for a list of valid values.

parse_compression
Internal: Interpret compression.

See dbfilealter for a list of valid values.

establish_new_col_mapping
Internal.

col_create
    $fsdb->col_create($col_name)

Add a new column named $COL_NAME to the schema. Returns undef on
failure, or 1 if successful. (Note: does not return the column index on
creation, so that "or" can be used for error checking, given that the
column number could be zero.) Also, update the header row to reflect
this column (compare to "_internal_col_create").

_internal_col_create
    $fsdb->_internal_col_create($col_name)

For internal "Fsdb::IO" use only. Create a new column $COL_NAME, just
like "col_create", but do not update the header row (as that function
does).

field_contains_fs
    $boolean = $fsdb->field_contains_fs($field);

Determine if the $FIELD contains $FSDB's fscode (in which case it is
malformed).

fref_contains_fs
    $boolean = $fsdb->fref_contains_fs($fref);

Determine if any field in $FREF contains $FSDB's fscode (in which case
it is malformed).

correct_fref_containing_fs
    $boolean = $fsdb->correct_fref_containing_fs($fref);

Patch up any field in $FREF that contains $FSDB's fscode, as best as
possible, by turning the field separator into underscores. Updates
$FREF in place, and returns whether it was altered. This function
loses data.
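The behaviour described above can be sketched in plain Perl. The helper patch_fields_sketch and its explicit separator argument are hypothetical; how Fsdb itself treats runs of separators or multi-character separators is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the description above: replace each
# occurrence of the field separator $fs inside a field with an underscore.
# Modifies the array in place and returns true if anything changed.
sub patch_fields_sketch {
    my ($fref, $fs) = @_;
    my $altered = 0;
    for my $field (@$fref) {
        # s/// returns the number of substitutions, so it doubles as a test.
        $altered = 1 if $field =~ s/\Q$fs\E/_/g;
    }
    return $altered;
}

my @row = ("a\tb", "c");
my $changed = patch_fields_sketch(\@row, "\t");
print "changed=$changed row=@row\n";
```

The original separator characters cannot be recovered from the patched field, which is why the operation is lossy.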

fscode
    $fscode = $fsdb->fscode;

Returns the fscode of the given database. (The encoded version
representing the field separator.) See also fs to get the actual field
separator.

fs
    $fs = $fsdb->fs;

Returns the field separator. See "fscode" to get the "encoded"
version.

rscode
    $rscode = $fsdb->rscode;

Returns the rscode of the given database.

ncols
    $ncols = $fsdb->ncols;

Return the number of columns.

cols
    $fields_aref = $fsdb->cols;

Returns the column headings (the field names) of the open database as
an aref.

col_to_i
    $i = $fsdb->col_to_i($column_name);

Returns the column index (0-based) of a given $COLUMN_NAME.

Note: tests for existence of columns must use "defined", since the
index can be 0 which would be interpreted as false.
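The pitfall in the note can be shown without Fsdb at all. In this sketch, col_to_i_sketch and the %col_index map are hypothetical stand-ins for a real stream's col_to_i method.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a column-name-to-index map; with Fsdb you
# would call $in->col_to_i($name) instead.
my %col_index = (x => 0, y => 1, product => 2);
sub col_to_i_sketch { return $col_index{ $_[0] } }

# Wrong: index 0 is false, so column 'x' looks missing.
my $wrong = col_to_i_sketch('x') ? 'found' : 'missing';

# Right: test definedness, either with defined() or the // operator.
my $right = defined(col_to_i_sketch('x')) ? 'found' : 'missing';
my $x_i   = col_to_i_sketch('x') // die "no x column.\n";

print "truth test: $wrong; defined test: $right; x_i=$x_i\n";
```

The defined-or operator (//) is what the array example in the synopsis uses, and it keeps the check on the same line as the lookup.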

i_to_col
    $column_name = $fsdb->i_to_col($column_index);

Return the name of the COLUMN_INDEX-th (0-based) column.

fastpath_cancel
    $fsdb->fastpath_cancel();

Discard any active fastpath code and allow fastpath-incompatible
operations.

codify
    ($code, $has_last_refs) = $self->codify($underscored_pseudocode);

Convert db-code $UNDERSCORED_PSEUDOCODE into perl code in the context
of a given Fsdb stream.

We return a string of code $CODE that refs "@{$fref}" and "@{$lfref}"
for the current and prior row arrays, and a flag $HAS_LAST_REFS if
"@{$lfref}" is needed. It is the caller's job to set these up, probably
by evaling the returned string in the context of those variables.

The conversion is a rename of all _foo's into database fields. For
more perverse needs, _foo(N) means the Nth field after _foo. Also, as
of 29-Jan-00, _last_foo gives the last row's value (_last_foo(N) is not
supported). To convert we eval $codify_code.

20-Feb-07: _FROMFILE_foo opens the file called _foo and includes it in
place.

NEEDSWORK: Should make some attempt to catch misspellings of column
names.
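A much-simplified sketch of the renaming step may help. It handles only _foo and _last_foo, takes the column map as an explicit argument, and dies on unknown names; codify_sketch is a hypothetical illustration, not the real codify.

```perl
use strict;
use warnings;

# Hypothetical, simplified renaming step: handles only _foo and
# _last_foo, not _foo(N) or _FROMFILE_foo. Not Fsdb's codify.
sub codify_sketch {
    my ($pseudocode, $col_index) = @_;
    my $has_last_refs = 0;
    (my $code = $pseudocode) =~ s{\b_(last_)?(\w+)}{
        my ($last, $name) = ($1, $2);
        die "unknown column: $name\n" unless exists $col_index->{$name};
        $has_last_refs = 1 if $last;
        ($last ? '$lfref' : '$fref') . "->[$col_index->{$name}]";
    }ge;
    return ($code, $has_last_refs);
}

my %idx = (x => 0, y => 1, product => 2);
my ($code, $has_last) = codify_sketch('_product = _x * _y', \%idx);
print "$code\n";

# The caller sets up $fref (and $lfref if needed) and evals the string.
my $fref = [3, 4, '-'];
my $lfref;
eval "$code; 1" or die $@;
print "product=$fref->[2]\n";
```

Dying on an unknown name is one possible way to catch misspelled columns; it is a choice made for this sketch, not a description of what Fsdb does.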

clean_potential_columns
    @clean = Fsdb::IO::clean_potential_columns(@dirty);

Clean up user-provided column names.

perl v5.32.1                      2021-01-27                      Fsdb::IO(3)