PDL::IO::Misc(3pm)

Misc(3)               User Contributed Perl Documentation              Misc(3)

NAME

       PDL::IO::Misc - misc IO routines for PDL

DESCRIPTION

       Some basic I/O functionality: FITS, tables, byte-swapping

SYNOPSIS

        use PDL::IO::Misc;

FUNCTIONS

   bswap2
         Signature: (x(); )

       Swaps pairs of bytes in argument x()

       bswap2 does not process bad values.  It will set the bad-value flag of
       all output ndarrays if the flag is set for any of the input ndarrays.

   bswap4
         Signature: (x(); )

       Swaps quads of bytes in argument x()

       bswap4 does not process bad values.  It will set the bad-value flag of
       all output ndarrays if the flag is set for any of the input ndarrays.

   bswap8
         Signature: (x(); )

       Swaps groups of eight bytes in argument x()

       bswap8 does not process bad values.  It will set the bad-value flag of
       all output ndarrays if the flag is set for any of the input ndarrays.
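       As an illustration of the byte swapping described above, the
       following minimal sketch swaps 16-bit and 32-bit integers.  The
       variable names and sample values are made up for the example, and it
       assumes the functions modify their argument in place.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;

# 16-bit values: 0x0102 becomes 0x0201 once its two bytes are swapped.
my $s = short(0x0102, 0x0a0b);
bswap2($s);                      # assumed to modify $s in place
printf "%#06x %#06x\n", $s->list;

# 32-bit value: the four bytes of each element are reversed.
my $l = long(0x01020304);
bswap4($l);
printf "%#010x\n", $l->list;
```

       The element type has to match the swap width: bswap2 on a 32-bit
       ndarray would swap within each half of every element rather than
       reverse the whole element.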

   rcols
       Read specified ASCII columns from a file into ndarrays and perl arrays
       (also see "rgrep").

         Usage:
           ($x,$y,...) = rcols( *HANDLE|"filename", { EXCLUDE => '/^!/' }, $col1, $col2, ... )
                    $x = rcols( *HANDLE|"filename", { EXCLUDE => '/^!/' }, [] )
           ($x,$y,...) = rcols( *HANDLE|"filename", $col1, $col2, ..., { EXCLUDE => '/^!/' } )
           ($x,$y,...) = rcols( *HANDLE|"filename", "/foo/", $col1, $col2, ... )

       For each column number specified, a 1D output ndarray will be generated.
       Anonymous arrays of column numbers generate 2D output ndarrays with
       dim0 for the column data and dim1 equal to the number of columns in the
       anonymous array(s).

       An empty anonymous array as column specification will produce a single
       output data ndarray with dim(1) equal to the number of columns
       available.

       There are two calling conventions: the old version, where a pattern
       can be specified after the filename/handle, and the new version, where
       options are given as a hash reference.  This reference can be given as
       either the second or last argument.

       The default behaviour is to ignore lines beginning with a # character
       and lines that only consist of whitespace.  Options exist to only read
       from lines that match, or do not match, supplied patterns, and to set
       the types of the created ndarrays.

       rcols accepts a file name or *HANDLE.  If no explicit column numbers
       are specified, all columns are read.  For the allowed types, see
       "Datatype_conversions" in PDL::Core.

       Options (case insensitive):

         EXCLUDE or IGNORE
         - ignore lines matching this pattern (default '/^#/').

         INCLUDE or KEEP
         - only use lines which match this pattern (default '').

         LINES
         - a string pattern specifying which line numbers to use.
         Line numbers start at 0 and the syntax is 'a:b:c' to use
         every c'th matching line between a and b (default '').

         DEFTYPE
         - default data type for stored data (if not specified, use the type
         stored in $PDL::IO::Misc::deftype, which starts off as double).

         TYPES
         - reference to an array of data types, one element for each column
         to be read in.  Any missing columns use the DEFTYPE value (default []).

         COLSEP
         - splits on this string/pattern/qr{} between columns of data. Defaults to
         $PDL::IO::Misc::defcolsep.

         PERLCOLS
         - an array of column numbers which are to be read into perl arrays
         rather than ndarrays.  Any columns not specified in the explicit list
         of columns to read will be returned after the explicit columns
         (default undef).

         COLIDS
         - if defined to an array reference, it will be assigned the column
         ID values obtained by splitting the first line of the file in the
         identical fashion to the column data.

         CHUNKSIZE
         - the number of input data elements to batch together before appending
         to each output data ndarray (default 100).  If CHUNKSIZE is
         greater than the number of lines of data to read, the entire file is
         slurped in, lines split, and perl lists of column data are generated.
         At the end, effectively pdl(@column_data) produces any result ndarrays.

         VERBOSE
         - be verbose about IO processing (default $PDL::verbose)

       For example:

         $x      = PDL->rcols('file1');        # file1 has only one column of data
         $x      = PDL->rcols('file2', []);    # file2 can have multiple columns, still 1 ndarray output
                                               # (empty array ref spec means all possible data fields)

         ($x,$y) = rcols 'table.csv', { COLSEP => ',' };  # read CSV data file
         ($x,$y) = rcols *STDIN;   # default separator for lines like '32 24'

         # read in lines containing the string foo, where the first
         # example also ignores lines that begin with a # character.
         ($x,$y,$z) = rcols 'file2', 0,4,5, { INCLUDE => '/foo/' };
         ($x,$y,$z) = rcols 'file2', 0,4,5, { INCLUDE => '/foo/', EXCLUDE => '' };

         # skip the first 27 lines of the file and read the data as ushort
         ($x,$y) = rcols 'file3', { LINES => '27:-1', DEFTYPE => ushort };
         ($x,$y) = rcols 'file3', { LINES => '27:', TYPES => [ ushort, ushort ] };

         # read in the first column as a perl array and the next two as ndarrays
         # with the perl column returned after the ndarray outputs
         ($x,$y,$name) = rcols 'file4', 1, 2   , { PERLCOLS => [ 0 ] };
         printf "Number of names read in = %d\n", 1 + $#$name;

         # read in the first column as a perl array and the next two as ndarrays
         # with PERLCOLS changing the type of the first returned value to perl list ref
         ($name,$x,$y) = rcols 'file4', 0, 1, 2, { PERLCOLS => [ 0 ] };

         # read in the first column as a perl array, returned first, followed by
         # the next two data columns in the file as a single Nx2 ndarray
         ($name,$xy) = rcols 'file4', 0, [1, 2], { PERLCOLS => [ 0 ] };

         NOTES:

         1. Patterns must be quoted, or written with the qr{} regexp quoting syntax.

         2. Columns are separated by whitespace by default.  Use the COLSEP
            option to specify an alternate split pattern or string, or change
            the default separator by setting $PDL::IO::Misc::defcolsep.

         3. Legacy support is present to use $PDL::IO::Misc::colsep to set the
            column separator, but $PDL::IO::Misc::colsep is not defined by default.
            If you set the variable to a defined value it will get picked up.

         4. LINES => '-1:0:3' may not work as you expect, since lines are skipped
            when read in, then the whole array reversed.

         5. For consistency with wcols and rcols 1D usage, column data is loaded
            into the rows of the pdls (i.e., dim(0) is the elements read per column
            in the file and dim(1) is the number of columns of data read).
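       The column-selection rules above can be tried on a small throwaway
       file.  This is only a sketch: the temporary file and its contents are
       invented for the illustration, and File::Temp is used purely to keep
       the example self-contained.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;
use File::Temp qw(tempfile);

# Hypothetical sample data: one comment line, then three rows of two columns.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "# x y\n1 10\n2 20\n3 30\n";
close $fh;

# One 1D ndarray per requested column; the '#' line is skipped by default.
my ($x, $y) = rcols $file, 0, 1;

# An empty array ref collects every column into a single 2D ndarray.
my $xy = rcols $file, [];

print "x = $x, y = $y\n";
print "dims of xy: ", join(' x ', $xy->dims), "\n";
```

       With this input, $xy should have dim(0) equal to 3 (the values read
       per column) and dim(1) equal to 2 (the number of columns), in line
       with note 5 above.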

   wcols
         Write ASCII columns into file from 1D or 2D ndarrays and/or 1D listrefs efficiently.

       Can take file name or *HANDLE, and if no file/filehandle is given
       defaults to STDOUT.

         Options (case insensitive):

           HEADER - prints this string before the data. If the string
                    is not terminated by a newline, one is added (default '').

           COLSEP - prints this string between columns of data. Defaults to
                    $PDL::IO::Misc::defcolsep.

           FORMAT - A printf-style format string that is cycled through
                    column output for user controlled formatting.

        Usage: wcols $data1, $data2, $data3,..., *HANDLE|"outfile", [\%options];  # or
               wcols $format_string, $data1, $data2, $data3,..., *HANDLE|"outfile", [\%options];

          where the $dataN args are either 1D ndarrays, 1D perl array refs,
          or 2D ndarrays (as might be returned from rcols() with the [] column
          syntax and/or using the PERLCOLS option).  dim(0) of all ndarrays
          written must be the same size.  The printf-style $format_string,
          if given, overrides any FORMAT key settings in the option hash.

       e.g.,

         $x = random(4); $y = ones(4);
         wcols $x, $y+2, 'foo.dat';
         wcols $x, $y+2, *STDERR;
         wcols $x, $y+2, '|wc';

         $x = sequence(3); $y = zeros(3); $c = random(3);
         wcols $x,$y,$c; # Orthogonal version of 'print $x,$y,$c' :-)

         wcols "%10.3f", $x,$y; # Formatted
         wcols "%10.3f %10.5g", $x,$y; # Individual column formatting

         $x = sequence(3); $y = zeros(3); $units = [ 'm/sec', 'kg', 'MPH' ];
         wcols $x,$y, { HEADER => "#   x   y" };
         wcols $x,$y, { Header => "#   x   y", Colsep => ', ' };  # case insensitive option names!
         wcols " %4.1f  %4.1f  %s",$x,$y,$units, { header => "# Day  Time  Units" };

         $a52 = sequence(5,2); $y = ones(5); $c = [ 1, 2, 4 ];
         wcols $a52;         # now can write out 2D pdls (2 columns data in output)
         wcols $y, $a52, $c; # ...and mix and match with 1D listrefs as well

         NOTES:

         1. Columns are separated by whitespace by default.  Use the COLSEP
            option, or set $PDL::IO::Misc::defcolsep to change the default
            value.

         2. Support for the $PDL::IO::Misc::colsep global value
            of PDL-2.4.6 and earlier is maintained, but the initial value
            of the global is undef until you set it.  The value will
            then be picked up and used as if defcolsep were specified.

         3. Dim 0 corresponds to the column data dimension for both
            rcols and wcols.  This makes wcols the reverse operation
            of rcols.
231   swcols
232       generate string list from "sprintf" format specifier and a list of
233       ndarrays
234
235       "swcols" takes an (optional) format specifier of the printf sort and a
236       list of 1D ndarrays as input. It returns a perl array (or array
237       reference if called in scalar context) where each element of the array
238       is the string generated by printing the corresponding element of the
239       ndarray(s) using the format specified. If no format is specified it
240       uses the default print format.
241
242        Usage: @str = swcols format, pdl1,pdl2,pdl3,...;
243           or  $str = swcols format, pdl1,pdl2,pdl3,...;
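       A short sketch of both calling contexts follows.  The ndarrays and
       the format string are arbitrary choices for the illustration.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;

my $x = sequence(3);          # 0 1 2
my $y = sequence(3) + 0.5;    # 0.5 1.5 2.5

# List context: one formatted string per element of the input ndarrays.
my @str = swcols "%4.1f %6.2f", $x, $y;
print "$_\n" for @str;

# Scalar context: the same strings, returned as an array reference.
my $ref = swcols "%4.1f %6.2f", $x, $y;
print scalar(@$ref), " strings\n";
```

       This is handy when the formatted rows are needed as data, for
       instance to build labels, rather than written straight to a file
       as wcols does.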

   rgrep
         Read columns into ndarrays using full regexp pattern matching.

         Options:

         UNDEFINED: This option determines what will be done for undefined
         values, for instance when reading a comma-separated file of the type
         1,2,,4 where the ,, indicates a missing value.

         The default is to assign $PDL::undefval to undefined values,
         but if UNDEFINED is set, its value is used instead. This would normally
         be set to a number, but if it is set to Bad and PDL is compiled
         with Badvalue support (see PDL::Bad) then undefined values are set to
         the appropriate badvalue and the column is marked as bad.

         DEFTYPE: Sets the default type of the columns - see the documentation
         for rcols.

         TYPES:   A reference to a Perl array with types for each column - see
         the documentation for rcols.

         BUFFERSIZE: The number of lines to extend the ndarray by. It might speed
         up the reading a little bit by setting this to the number of lines in the
         file, but in general rasc is a better choice.

       Usage

        ($x,$y,...) = rgrep(sub, *HANDLE|"filename")

       e.g.

        ($x,$y) = rgrep {/Foo (.*) Bar (.*) Mumble/} $file;

       i.e. the vectors $x and $y get the progressive values of $1, $2 etc.
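       The pattern from the example above can be exercised on a small
       scratch file.  This is a sketch under assumptions: the file contents
       are invented, and the line that does not match is expected to be
       skipped.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;
use File::Temp qw(tempfile);

# Hypothetical input: two matching lines around one that does not match.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "Foo 1 Bar 2 Mumble\n",
            "some other line\n",
            "Foo 3 Bar 4 Mumble\n";
close $fh;

# Each capture group becomes one output ndarray.
my ($x, $y) = rgrep { /Foo (.*) Bar (.*) Mumble/ } $file;
print "x = $x, y = $y\n";
```

       Every matching line must yield the same number of captures, since
       each capture position maps to one output column.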

   isbigendian
         Determine endianness of machine - returns 1 if it is big-endian,
         0 if it is little-endian
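       A typical use is deciding whether data needs byte-swapping before it
       is written in a fixed byte order.  The sketch below cross-checks the
       result against core Perl's pack; the choice of big-endian as the
       target order and the sample values are assumptions for the example.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;

my $big = isbigendian();

# Cross-check with core Perl: on a little-endian machine the first byte
# of a packed native 16-bit integer 1 is 1, on a big-endian machine it is 0.
my $first_byte = unpack 'C', pack('S', 1);
printf "isbigendian: %d, first byte of packed 1: %d\n", $big, $first_byte;

# Bring 32-bit data into big-endian order only where a swap is needed.
my $data = long(1, 2, 3);
bswap4($data) unless $big;
```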

   rasc
         Simple function to slurp in ASCII numbers quite quickly,
         although error handling is marginal (to nonexistent).

         $pdl->rasc("filename"|FILEHANDLE [,$noElements]);

             Where:
               filename is the name of the ASCII file to read, or an open file handle
               $noElements is the optional number of elements in the file to read.
                   (If not present, all of the file will be read to fill up $pdl).
               $pdl can be of type float or double (for more precision).

         #  (test.num is an ascii file with 20 numbers. One number per line.)
         $in = PDL->null;
         $num = 20;
         $in->rasc('test.num', $num);
         $imm = zeroes(float,20,2);
         $imm->rasc('test.num');
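       The preallocated form of the call can be sketched as follows.  The
       scratch file and its four values are invented for the illustration,
       and the ndarray is sized to match the file so that no element count
       has to be passed.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;
use File::Temp qw(tempfile);

# Hypothetical input file: four numbers, one per line.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "1.5\n2.5\n3\n4\n";
close $fh;

# Preallocate a double ndarray of the right length and fill it from the file.
my $in = zeroes(double, 4);
$in->rasc($file);
print "read: $in\n";
```

       Because error handling is minimal, it is worth checking the result
       against the expected number of values when the input is not under
       your control.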

   rcube
        Read list of files directly into a large data cube (for efficiency)

        $cube = rcube \&reader_function, @files;

        $cube = rcube \&rfits, glob("*.fits");

       This IO function allows direct reading of files into a large data cube.
       Obviously one could use cat() but this is more memory efficient.

       The reading function (e.g. rfits, readfraw), passed as a reference, and
       the files are the arguments.

       The cube is created with the same X,Y dims and datatype as the first
       image specified. The Z dim is simply the number of images.
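       Any code reference that maps a name to a 2D ndarray can serve as the
       reader.  To keep the following sketch free of real image files, it
       uses a hypothetical in-memory reader (the %image table and $reader
       are stand-ins invented for the example, not part of PDL).

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::Misc;

# Stand-in for files on disk: a hypothetical table of named 3x2 images.
my %image = (
    'a.img' => sequence(3, 2),
    'b.img' => sequence(3, 2) + 100,
);

# Hypothetical reader: takes a name and returns the matching 2D ndarray,
# playing the role that rfits would play for real FITS files.
my $reader = sub { my ($name) = @_; return $image{$name} };

my $cube = rcube $reader, 'a.img', 'b.img';
print "cube dims: ", join(' x ', $cube->dims), "\n";
```

       Here the cube should come out as 3 x 2 x 2, with the second image
       stored in the plane at Z index 1.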

AUTHOR

       Copyright (C) Karl Glazebrook 1997, Craig DeForest 2001, 2003, and
       Chris Marshall 2010. All rights reserved. There is no warranty. You are
       allowed to redistribute this software / documentation under certain
       conditions. For details, see the file COPYING in the PDL distribution.
       If this file is separated from the PDL distribution, the copyright
       notice should be included in the file.

perl v5.38.0                      2023-07-21                           Misc(3)