UUlib(3)              User Contributed Perl Documentation             UUlib(3)

NAME

       Convert::UUlib - Perl interface to the uulib library (a.k.a.
       uudeview/uuenview).

SYNOPSIS

        use Convert::UUlib ':all';

        # read all the files named on the commandline and decode them
        # into the current directory. See below for a longer example.
        LoadFile $_ for @ARGV;

        for my $uu (GetFileList) {
           if ($uu->state & FILE_OK) {
              $uu->decode;
              print $uu->filename, "\n";
           }
        }

DESCRIPTION

       For in-depth information about the C library used in this interface,
       read the file doc/library.pdf from the distribution. Then read the
       rest of this document, especially the non-trivial decoder program at
       the end.

EXPORTED CONSTANTS

   Action code constants
         ACT_IDLE      we don't do anything
         ACT_SCANNING  scanning an input file
         ACT_DECODING  decoding into a temp file
         ACT_COPYING   copying temp to target
         ACT_ENCODING  encoding a file

   Message severity levels
         MSG_MESSAGE   just a message, nothing important
         MSG_NOTE      something that should be noticed
         MSG_WARNING   important msg, processing continues
         MSG_ERROR     processing has been terminated
         MSG_FATAL     decoder cannot process further requests
         MSG_PANIC     recovery impossible, app must terminate

   Options
         OPT_VERSION   version number MAJOR.MINORplPATCH (ro)
         OPT_FAST      assumes only one part per file
         OPT_DUMBNESS  switch off the program's intelligence
         OPT_BRACKPOL  give numbers in [] higher precedence
         OPT_VERBOSE   generate informative messages
         OPT_DESPERATE try to decode incomplete files
         OPT_IGNREPLY  ignore RE:plies (off by default)
         OPT_OVERWRITE whether it's OK to overwrite existing files
         OPT_SAVEPATH  prefix to save-files on disk
         OPT_IGNMODE   ignore the original file mode
         OPT_DEBUG     print messages with FILE/LINE info
         OPT_ERRNO     get last error code for RET_IOERR (ro)
         OPT_PROGRESS  retrieve progress information
         OPT_USETEXT   handle text messages
         OPT_PREAMB    handle MIME preambles/epilogues
         OPT_TINYB64   detect short B64 outside of MIME
         OPT_ENCEXT    extension for single-part encoded files
         OPT_REMOVE    remove input files after decoding (dangerous)
         OPT_MOREMIME  strict MIME adherence
         OPT_DOTDOT    ".."-unescaping has not yet been done on input files
         OPT_RBUF      set default read I/O buffer size in bytes
         OPT_WBUF      set default write I/O buffer size in bytes
         OPT_AUTOCHECK automatically check file list after every LoadFile

   Result/Error codes
         RET_OK        everything went fine
         RET_IOERR     I/O Error - examine errno
         RET_NOMEM     not enough memory
         RET_ILLVAL    illegal value for operation
         RET_NODATA    decoder didn't find any data
         RET_NOEND     encoded data wasn't ended properly
         RET_UNSUP     unsupported function (encoding)
         RET_EXISTS    file exists (decoding)
         RET_CONT      continue -- special from ScanPart
         RET_CANCEL    operation canceled

   File States
        This code is zero, i.e. "false":

         UUFILE_READ   Read in, but not further processed

        The following state codes are or'ed together:

         FILE_MISPART  Missing Part(s) detected
         FILE_NOBEGIN  No 'begin' found
         FILE_NOEND    No 'end' found
         FILE_NODATA   File does not contain valid uudata
         FILE_OK       All Parts found, ready to decode
         FILE_ERROR    Error while decoding
         FILE_DECODED  Successfully decoded
         FILE_TMPFILE  Temporary decoded file exists
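
        Because the state codes are or'ed together, individual flags are
        best tested with a bitwise "&" rather than compared for equality.
        The following is only a sketch: the helper name "describe_state"
        and its labels are made up for illustration and are not part of
        the module.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

# Pair each documented state bit with a short label
# (the labels are illustrative, not part of the module).
my @state_bits = (
   [FILE_MISPART, "missing parts"],
   [FILE_NOBEGIN, "no begin"],
   [FILE_NOEND,   "no end"],
   [FILE_NODATA,  "no data"],
   [FILE_OK,      "ok"],
   [FILE_ERROR,   "error"],
   [FILE_DECODED, "decoded"],
   [FILE_TMPFILE, "temp file"],
);

# Hypothetical helper: turn a state value into a readable string.
sub describe_state {
   my ($state) = @_;
   return "read" unless $state;   # the zero state: read in, not processed
   return join ", ", map { $_->[1] } grep { $state & $_->[0] } @state_bits;
}

LoadFile $_ for @ARGV;

for my $item (GetFileList) {
   my $name = $item->filename // "(no name)";
   print "$name: ", describe_state($item->state), "\n";
}
```

        Several bits can be set at once, which is why the helper collects
        every matching label instead of stopping at the first one.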

   Encoding types
         UU_ENCODED    UUencoded data
         B64_ENCODED   Mime-Base64 data
         XX_ENCODED    XXencoded data
         BH_ENCODED    Binhex encoded
         PT_ENCODED    Plain-Text encoded (MIME)
         QP_ENCODED    Quoted-Printable (MIME)
         YENC_ENCODED  yEnc encoded (non-MIME)

EXPORTED FUNCTIONS

   Initializing and cleanup
       Initialize is called automatically when the module is loaded and
       allocates quite a small amount of memory for today's machines ;)
       CleanUp releases that again.

       On my machine, a fairly complete decode with DBI backend needs about
       10MB RSS to decode 20000 files.

       CleanUp
           Release memory, file items and clean up files. Should be called
           after a decoding run, if you want to start a new one.

   Setting and querying options
       $option = GetOption OPT_xxx
       SetOption OPT_xxx, opt-value

       See the "OPT_xxx" constants above to see which options exist.
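
       A minimal sketch of reading and setting options, using only the
       constants listed above. The chosen values are arbitrary examples.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

# OPT_VERSION is marked (ro): it can be queried but not set.
print "uulib version: ", GetOption(OPT_VERSION), "\n";

# Switch on desperate mode, then read the value back.
SetOption OPT_DESPERATE, 1;
print "desperate mode: ", GetOption(OPT_DESPERATE), "\n";

# String-valued options are set the same way.
SetOption OPT_SAVEPATH, "uudst/";
```

       Options are global to the library, so set them before loading
       any files.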

   Setting various callbacks
       SetMsgCallback [callback-function]
       SetBusyCallback [callback-function]
       SetFileCallback [callback-function]
       SetFNameFilter [callback-function]

   Call the currently selected FNameFilter
       $file = FNameFilter $file
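
       As a sketch of how the two fit together, the following installs a
       filter that strips directory components and then runs a name
       through whichever filter is currently selected. The path is made
       up, and that the filter receives the name as its only argument is
       taken from the example decoder at the end of this document.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

# Install a filter that keeps only the last path component.
SetFNameFilter sub {
   my ($path) = @_;
   $path =~ s{^.*[/\\]}{};
   $path
};

# Run an arbitrary name through the currently selected filter.
my $clean = FNameFilter "some/dir/archive.rar";
print "$clean\n";
```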

   Loading sourcefiles, optionally fuzzy merge and start decoding
       ($retval, $count) = LoadFile $fname, [$id, [$delflag, [$partno]]]
           Load the given file and scan it for encoded contents. Optionally
           tag it with the given id, and if $delflag is true, delete the file
           after it is no longer necessary. If you are certain of the part
           number, you can specify it as the last argument.

           A better (usually faster) way of doing this is using the
           "SetFNameFilter" functionality.

       $retval = Smerge $pass
           If you are desperate, try calling "Smerge" with increasing $pass
           values, beginning at 0, to merge parts that would not usually
           have been merged.

           Most probably this will result in garbled files, so never do this
           by default, with one exception:

           If the "OPT_AUTOCHECK" option has been disabled (it is enabled by
           default) to speed up file loading, then you have to call "Smerge
           -1" after loading all files as an additional pre-pass (which is
           normally done by "LoadFile").
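
       The "OPT_AUTOCHECK" case can be sketched as follows. This shows
       only the mandatory "Smerge -1" pre-pass; the desperate passes
       starting at 0 are left out on purpose, as this document does not
       say how many passes exist.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

# Skip the automatic check after every LoadFile ...
SetOption OPT_AUTOCHECK, 0;

for my $file (@ARGV) {
   my ($retval, $count) = LoadFile $file;
   warn "$file: ", strerror($retval), "\n" if $retval != RET_OK;
}

# ... and run the pre-pass once, after everything has been loaded.
Smerge(-1);

my @items = GetFileList;
print scalar(@items), " result file(s) found\n";
```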

       $item = GetFileListItem $item_number
           Return the $item structure for the $item_number'th found file, or
           "undef" if no file with that number exists.

           The first file has number 0, and the series has no holes, so you
           can iterate over all files by starting with zero and incrementing
           until you hit "undef".

           This function has to walk the linear list of files on each access,
           so if you want to iterate over all items, it is usually faster to
           use "GetFileList".

       @items = GetFileList
           Similar to "GetFileListItem", but returns all files in one go.
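
       Both ways of iterating, side by side, as a rough sketch. The
       counter variable and the "(no name)" placeholder are illustrative
       only.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

LoadFile $_ for @ARGV;

# Index-based walk: numbers start at 0 and have no holes,
# so the first undef marks the end of the list.
my $n = 0;
while (defined(my $item = GetFileListItem($n))) {
   my $name = $item->filename // "(no name)";
   print "$n: $name\n";
   $n++;
}

# The same items in one go, usually faster for a full iteration.
my @items = GetFileList;
print scalar(@items), " item(s) in total\n";
```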

   Decoding files
       $retval = $item->rename ($newname)
           Change the on-disk filename where the decoded file will be saved.

       $retval = $item->decode_temp
           Decode the file into a temporary location, use "$item->binfile"
           (see the item attributes below) to retrieve the temporary
           filename.

       $retval = $item->remove_temp
           Remove the temporarily decoded file again.

       $retval = $item->decode ([$target_path])
           Decode the file to its destination, or the given target path.

       $retval = $item->info (callback-function)
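
       One way to combine these methods is to decode into a temporary
       file first, look at the result, and only then commit or discard
       it. This is a sketch under the assumption that an empty temporary
       file is not worth keeping; that policy is illustrative, not
       something the library prescribes.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

LoadFile $_ for @ARGV;

for my $item (GetFileList) {
   next unless $item->state & FILE_OK;   # only complete files

   # Decode to a temporary file first.
   my $err = $item->decode_temp;
   if ($err != RET_OK) {
      warn "decode_temp failed: ", strerror($err), "\n";
      next;
   }

   # binfile names the temporary file; inspect it before committing.
   my $tmp = $item->binfile;
   if (defined $tmp && -s $tmp) {
      $err = $item->decode;              # write to the final destination
      warn "decode failed: ", strerror($err), "\n" if $err != RET_OK;
   } else {
      $item->remove_temp;                # discard empty results
   }
}
```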

   Querying (and setting) item attributes
       $state    = $item->state
       $mode     = $item->mode ([newmode])
       $uudet    = $item->uudet
       $size     = $item->size
       $filename = $item->filename ([newfilename])
       $subfname = $item->subfname
       $mimeid   = $item->mimeid
       $mimetype = $item->mimetype
       $binfile  = $item->binfile

   Information about source parts
       $parts = $item->parts
           Return information about all parts (source files) used to decode
           the file as a list of hashrefs with the following structure:

            {
              partno   => <integer describing the part number, starting with 1>,
              # the following members only exist when they contain useful information
              sfname   => <local pathname of the file where this part is from>,
              filename => <the ondisk filename of the decoded file>,
              subfname => <used to cluster postings, possibly the posting filename>,
              subject  => <the subject of the posting/mail>,
              origin   => <the possible source (From) address>,
              mimetype => <the possible mimetype of the decoded file>,
              mimeid   => <the id part of the Content-Type>,
            }

           Usually you are mostly interested in the "sfname" and possibly the
           "partno" and "filename" members.
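
       Since every member except "partno" may be missing, code that
       walks the parts should supply defaults. A small sketch; the
       helper name "part_sources" and the placeholder text are made up
       for illustration.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

# Hypothetical helper: describe where each part of one item came from.
sub part_sources {
   my ($item) = @_;
   return map {
      sprintf "part %d from %s",
         $_->{partno}, $_->{sfname} // "(unknown source)"
   } $item->parts;
}

LoadFile $_ for @ARGV;

for my $item (GetFileList) {
   my $name = $item->filename // "(no name)";
   print "$name: $_\n" for part_sources($item);
}
```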

   Functions below are not documented and not very well tested - feedback
       welcome
         QuickDecode
         EncodeMulti
         EncodePartial
         EncodeToStream
         EncodeToFile
         E_PrepSingle
         E_PrepPartial

   EXTENSION FUNCTIONS
       Functions found in this module but not documented in the uulib
       documentation:

       $msg = straction ACT_xxx
           Return a human readable string representing the given action code.

       $msg = strerror RET_xxx
           Return a human readable string representing the given error code.

       $str = strencoding xxx_ENCODED
           Return the name of the encoding type as a string.

       $str = strmsglevel MSG_xxx
           Return the message level as a string.

       SetFileNameCallback $cb
           Sets (or queries) the FileNameCallback, which is called whenever
           the decoding library can't find a filename and wants to extract a
           filename from the subject line of a posting. The callback will be
           called with two arguments, the subject line and the current
           candidate for the filename. The latter argument can be "undef",
           which means that no filename could be found (and likely none
           exists, so it is safe to also return "undef" in this case). If
           the callback returns nothing at all (not even "undef"!), then
           nothing happens, so this is a no-op callback:

              sub cb {
                 return ();
              }

           If it returns "undef", then this indicates that no filename could
           be found. In all other cases, the return value is taken to be the
           filename.

           This is a slightly more useful callback:

              sub cb {
                 return unless $_[1]; # skip "Re:"-plies et al.
                 my ($subject, $filename) = @_;
                 # if we find some *.rar, take it
                 return $1 if $subject =~ /(\w+\.rar)/;
                 # otherwise just pass what we have
                 return ();
              }
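
       The string helpers from this section ("strmsglevel", "strerror"
       and "strencoding") are mostly useful for logging. A sketch that
       uses them together; the output format is arbitrary, and the
       message callback arguments are assumed to be the message and its
       level, as in the example decoder below.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

# Log every library message with its severity spelled out.
SetMsgCallback sub {
   my ($msg, $level) = @_;
   print STDERR strmsglevel($level), ": $msg\n";
};

# Report load results by name instead of by number.
for my $file (@ARGV) {
   my ($retval, $count) = LoadFile $file;
   print "$file: ", strerror($retval), " ($count part(s))\n";
}

# Name the encoding that was detected for each result file.
for my $item (GetFileList) {
   my $name = $item->filename // "(no name)";
   print "$name: ", strencoding($item->uudet), "\n";
}
```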

LARGE EXAMPLE DECODER

       The general workflow for decoding is like this:

       1. Configure options with "SetOption" or "SetXXXCallback".
       2. Load all source files with "LoadFile".
       3. Optionally "Smerge".
       4. Iterate over all "GetFileList" items (i.e. result files).
       5. "CleanUp" to delete files and free items.

       What follows is the file "example-decoder" from the distribution that
       illustrates the above workflow in a non-trivial example.

          #!/usr/bin/perl

          # decode all the files in the directory uusrc/ and copy
          # the resulting files to uudst/

          use Convert::UUlib ':all';

          sub namefilter {
             my ($path) = @_;

             # strip any leading directory components
             $path =~ s/^.*[\/\\]//;

             $path
          }

          sub busycb {
             my ($action, $curfile, $partno, $numparts, $percent, $fsize) = @_;
             $_[0] = straction($action);
             print "busy_callback(", (join ",", @_), ")\n";
             0
          }

          SetOption OPT_RBUF, 128*1024;
          SetOption OPT_WBUF, 1024*1024;
          SetOption OPT_IGNMODE, 1;
          SetOption OPT_VERBOSE, 1;

          # show the three ways you can set callback functions. I normally
          # prefer the one with the sub in place.
          SetFNameFilter \&namefilter;

          SetBusyCallback "busycb", 333;

          SetMsgCallback sub {
             my ($msg, $level) = @_;
             print uc(strmsglevel($level)), ": $msg\n";
          };

          # the following non-trivial FileNameCallback takes care
          # of some subject lines not detected properly by uulib:
          SetFileNameCallback sub {
             return unless $_[1]; # skip "Re:"-plies et al.
             local $_ = $_[0];

             # the following rules are rather effective on some newsgroups,
             # like alt.binaries.games.anime, where non-mime, uuencoded data
             # is very common

             # if we find some *.rar, take it as the filename
             return $1 if /(\S{3,}\.(?:[rstuvwxyz]\d\d|rar))\s/i;

             # one common subject format
             return $1 if /- "(.{2,}?\..+?)" (?:yenc )?\(\d+\/\d+\)/i;

             # - filename.par (04/55)
             return $1 if /- "?(\S{3,}\.\S+?)"? (?:yenc )?\(\d+\/\d+\)/i;

             # - (xxx) No. 1 sayuri81.jpg 756565 bytes
             # - (20 files) No.17 Roseanne.jpg [2/2]
             return $1 if /No\.[ 0-9]+ (\S+\....) (?:\d+ bytes )?\[/;

             # try to detect some common forms of filenames
             return $1 if /([a-z0-9_\-+.]{3,}\.[a-z]{3,4}(?:.\d+))/i;

             # otherwise just pass what we have
             ()
          };

          # now read all files in the directory uusrc/*
          # (the third argument is $delflag: each source file is deleted
          # once it is no longer necessary)
          for (<uusrc/*>) {
             my ($retval, $count) = LoadFile ($_, $_, 1);
             print "file($_), status(", strerror $retval, ") parts($count)\n";
          }

          SetOption OPT_SAVEPATH, "uudst/";

          # now wade through all files and their source parts
          for my $uu (GetFileList) {
             print "file ", $uu->filename, "\n";
             print " state ", $uu->state, "\n";
             print " mode ", $uu->mode, "\n";
             print " uudet ", strencoding $uu->uudet, "\n";
             print " size ", $uu->size, "\n";
             print " subfname ", $uu->subfname, "\n";
             print " mimeid ", $uu->mimeid, "\n";
             print " mimetype ", $uu->mimetype, "\n";

             # print additional info about all parts
             print " parts";
             for ($uu->parts) {
                for my $k (sort keys %$_) {
                   print " $k=$_->{$k}";
                }
                print "\n";
             }

             $uu->remove_temp;

             if (my $err = $uu->decode) {
                print " ERROR ", strerror $err, "\n";
             } else {
                print " successfully saved as uudst/", $uu->filename, "\n";
             }
          }

          print "cleanup...\n";

          CleanUp;

PERLMULTICORE SUPPORT

       This module supports the perlmulticore standard (see
       <http://perlmulticore.schmorp.de/> for more info) for the following
       functions - generally these are functions accessing the disk and/or
       using considerable CPU time:

          LoadFile
          $item->decode
          $item->decode_temp
          $item->remove_temp
          $item->info

       The perl interpreter will be reacquired/released on every callback
       invocation, so for performance reasons, callbacks should be avoided if
       that is costly.

       Future versions might enable multicore support for more functions.

BUGS AND LIMITATIONS

       The original uulib library this module uses was written at a time when
       main memory was measured in megabytes and buffer overflows as a
       security thing didn't exist. While a lot of security fixes have been
       applied over the years (including some defense in depth mechanisms
       that can shield against a lot of as-of-yet undetected bugs), using
       this library for security purposes requires care.

       Likewise, file sizes when the uulib library was written were tiny
       compared to today, so do not expect this library to handle files larger
       than 2GB.

       Lastly, this module uses a very "C-like" interface, which means it
       doesn't protect you from invalid pointers as you might expect from
       "more perlish" modules - for example, accessing a file item object
       after calling "CleanUp" will likely result in crashes, memory
       corruption, or worse.
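
       One defensive pattern, shown here only as a sketch, is to copy
       whatever is needed out of the item objects into plain Perl data
       and to keep the objects themselves confined to a block that ends
       before "CleanUp" is called. The @summary structure is made up for
       illustration.

```perl
use strict;
use warnings;
use Convert::UUlib ':all';

LoadFile $_ for @ARGV;

# Copy everything needed out of the items into plain Perl data ...
my @summary;
{
   # ... and keep the item objects confined to this block.
   for my $item (GetFileList) {
      push @summary, {
         filename => scalar $item->filename,
         state    => scalar $item->state,
         size     => scalar $item->size,
      };
   }
}

CleanUp;

# From here on only @summary is used; no item object is still in scope.
for my $entry (@summary) {
   printf "%s (%d bytes)\n",
      $entry->{filename} // "(no name)", $entry->{size} // 0;
}
```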

AUTHOR

       Marc Lehmann <schmorp@schmorp.de>. The original uulib library was
       written by Frank Pilhofer <fp@informatik.uni-frankfurt.de>, and later
       heavily bugfixed by Marc Lehmann.

SEE ALSO

       perl(1), uudeview homepage at
       <http://www.fpx.de/fp/Software/UUDeview/>.

perl v5.30.2                      2020-04-26                          UUlib(3)