MIME::Parser::Filer(3)    User Contributed Perl Documentation    MIME::Parser::Filer(3)

NAME

       MIME::Parser::Filer - manage file-output of the parser

SYNOPSIS

       Before reading further, you should see MIME::Parser to make sure that
       you understand where this module fits into the grand scheme of things.
       Go on, do it now.  I'll wait.

       Ready?  Ok... now read "DESCRIPTION" below, and everything else should
       make sense.

       Public interface

           ### Create a "filer" of the desired class:
           my $filer = MIME::Parser::FileInto->new($dir);
           my $filer = MIME::Parser::FileUnder->new($basedir);
           ...

           ### Want added security?  Don't let outsiders name your files:
           $filer->ignore_filename(1);

           ### Prepare for the parsing of a new top-level message:
           $filer->init_parse;

           ### Return the path where this message's data should be placed:
           $path = $filer->output_path($head);

       Semi-public interface

       These methods might be overridden or ignored in some subclasses, so they
       don't all make sense in all circumstances:

           ### Tweak the mapping from content-type to extension:
           $emap = $filer->output_type_ext;
           $emap->{"text/html"} = ".htm";

DESCRIPTION

       How this class is used when parsing

       When a MIME::Parser decides that it wants to output a file to disk, it
       uses its "Filer" object -- an instance of a MIME::Parser::Filer
       subclass -- to determine where to put the file.

       Every parser has a single Filer object, which it uses for all parsing.
       You can get the Filer for a given $parser like this:

           $filer = $parser->filer;

       At the beginning of each "parse()", the filer's internal state is reset
       by the parser:

           $parser->filer->init_parse;

       The parser can then get a path for each entity in the message by
       handing that entity's header (a MIME::Head) to the filer and having it
       do the work, like this:

           $new_file = $parser->filer->output_path($head);

       Since it's nice to be able to clean up after a parse (especially a
       failed parse), the parser tells the filer when it has actually used a
       path:

           $parser->filer->purgeable($new_file);

       Then, if you want to clean up the files which were created for a
       particular parse (and also any directories that the Filer created), you
       would do this:

           $parser->filer->purge;
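       The sequence above can be sketched end to end.  This is a minimal
       example, assuming MIME-tools is installed; the message text and the
       scratch directory are made up for illustration, and output_under()
       and parse_data() are MIME::Parser methods, not part of the Filer.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;

# Scratch base directory (illustrative); removed when the program exits.
my $base = tempdir(CLEANUP => 1);

my $parser = MIME::Parser->new;
$parser->output_under($base);    # installs a MIME::Parser::FileUnder filer

# A tiny made-up message with a single text/plain body.
my $message = join "\n",
    'From: sender@example.com',
    'Subject: demo',
    'MIME-Version: 1.0',
    'Content-Type: text/plain',
    '',
    'Hello, world.',
    '';

# The parse resets the filer, asks it for an output path for the body,
# and marks each path it actually uses as purgeable.
my $entity = $parser->parse_data($message);

# Everything recorded for this parse, directories included.
my @created = $parser->filer->purgeable;
print "created: $_\n" for @created;

# Clean up; this must happen before the next parse clears the list.
$parser->filer->purge;
```

       The purge is done before any further parse because the purgeable
       list only ever describes the most recent one.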

       Writing your own subclasses

       There are two standard "Filer" subclasses (see below):
       MIME::Parser::FileInto, which throws all files from all parses into the
       same directory, and MIME::Parser::FileUnder (preferred), which creates
       a subdirectory for each message.  Hopefully, these will be sufficient
       for most uses, but just in case...

       The only method you have to override is output_path():

           $filer->output_path($head);

       This method is invoked by MIME::Parser when it wants to put a decoded
       message body in an output file.  The method should return a path to the
       file to create.  Failure is indicated by throwing an exception.

       The path returned by "output_path()" should be "ready for open()": any
       necessary parent directories need to exist at that point.  These
       directories can be created by the Filer, of course, and they should be
       marked as purgeable() if a purge should delete them.

       Actually, if your issue is more where the files go than what they're
       named, you can use the default output_path() method and just override
       one of its components:

           $dir  = $filer->output_dir($head);
           $name = $filer->output_filename($head);
           ...
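       As an illustration of overriding a single component, here is a
       sketch of a filer that keeps the default output_path() but sorts
       files into one directory per major content type.  The package name
       My::ByTypeFiler and the hash key MyBase are hypothetical, and the
       example assumes the filer object is a hash reference, as it is in
       the MIME-tools releases this page describes.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use MIME::Parser;    # also loads MIME::Parser::Filer

{
    package My::ByTypeFiler;    # hypothetical name, not part of MIME-tools
    our @ISA = ('MIME::Parser::Filer');

    # new() hands its arguments to init(); remember the base directory.
    sub init {
        my ($self, $base) = @_;
        $self->{MyBase} = $base;    # hypothetical key
    }

    # One subdirectory per major content type: text/, image/, ...
    sub output_dir {
        my ($self, $head) = @_;
        my ($major) = split m{/}, $head->mime_type;
        $major = 'other' unless defined $major && $major =~ /\A[a-z0-9-]+\z/;
        my $dir = File::Spec->catdir($self->{MyBase}, $major);
        unless (-d $dir) {
            mkdir $dir or die "mkdir $dir: $!";
            $self->purgeable($dir);    # so that purge() removes it again
        }
        return $dir;
    }
}

my $base   = tempdir(CLEANUP => 1);
my $parser = MIME::Parser->new;
$parser->filer(My::ByTypeFiler->new($base));

my $entity = $parser->parse_data(join "\n",
    'MIME-Version: 1.0', 'Content-Type: text/plain', '', 'Hello.', '');
my $path = $entity->bodyhandle->path;
print "body stored at $path\n";
```

       The directory is created inside output_dir() so that the returned
       path is "ready for open()", and it is marked purgeable so that a
       later purge removes it.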

PUBLIC INTERFACE

       MIME::Parser::Filer

       This is the abstract superclass of all "filer" objects.

       new INITARGS...
           Class method, constructor.  Create a new outputter for the given
           parser.  Any subsequent arguments are given to init(), which
           subclasses should override for their own use (the default init does
           nothing).

       results RESULTS
           Instance method.  Link this filer to a MIME::Parser::Results object
           which will tally the messages.  Notice that we avoid linking it to
           the parser to avoid a circular reference!

       init_parse
           Instance method.  Prepare to start parsing a new message.
           Subclasses should always be sure to invoke the inherited method.

       evil_filename FILENAME
           Instance method.  Is this an evil filename; i.e., one which should
           not be used in generating a disk file name?  It is if any of these
           are true:

               * it is empty
               * it is a string of dots: ".", "..", etc.
               * it contains characters not in the set: "A" - "Z", "a" - "z",
                 "0" - "9", "-", "_", "+", "=", ".", ",", "@", "#",
                 "$", and " ".
               * it is too long

           If you just want to change this behavior, you should override this
           method in the subclass of MIME::Parser::Filer that you use.
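           The four checks can be approximated in a few lines of plain
           Perl.  This is a sketch, not the module's own code: the function
           name looks_evil and the 255-character limit are assumptions,
           since the text above does not say what counts as "too long".

```perl
use strict;
use warnings;

# Hypothetical length limit; the documentation only says "too long".
my $MAX_LEN = 255;

sub looks_evil {
    my ($name) = @_;
    return 1 if !defined $name || $name eq '';           # empty
    return 1 if $name =~ /\A\.+\z/;                      # ".", "..", ...
    return 1 if $name =~ /[^A-Za-z0-9\-_+=.,\@\#\$ ]/;   # outside the allowed set
    return 1 if length($name) > $MAX_LEN;                # too long
    return 0;
}

for my $name ('report.txt', '..', '../etc/passwd', 'a b.txt', '') {
    printf "%-16s %s\n", "'$name'", looks_evil($name) ? 'evil' : 'ok';
}
```

           Because "/" and "\" are not in the allowed set, any name that
           carries a directory component is rejected by the third check.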

           Warning: at the time this method is invoked, the FILENAME has
           already been unmime'd into the local character set.  If you're
           using any character set other than ASCII, ISO-8859-*, or UTF-8, the
           interpretation of the "path" characters might be very different,
           and you will probably need to override this method.  See "unmime"
           in MIME::WordDecoder for more details.

           Note: subclasses of MIME::Parser::Filer which override
           output_path() might not consult this method; note, however, that
           the built-in subclasses do consult it.

           Thanks to Andrew Pimlott for finding a real dumb bug in the
           original version.  Thanks to Nickolay Saukh for noting that evil is
           in the eye of the beholder.

       exorcise_filename FILENAME
           Instance method.  If a given filename is evil (see "evil_filename")
           we try to rescue it by performing some basic operations: shortening
           it, removing bad characters, etc., and checking each against
           evil_filename().

           Returns the exorcised filename (which is guaranteed to not be
           evil), or undef if it could not be salvaged.

           Warning: at the time this method is invoked, the FILENAME has
           already been unmime'd into the local character set.  If you're
           using any character set other than ASCII, ISO-8859-*, or UTF-8, the
           interpretation of the "path" characters might be very different,
           and you will probably need to override this method.  See "unmime"
           in MIME::WordDecoder for more details.

       find_unused_path DIR, FILENAME
           Instance method, subclasses only.  We have decided on an output
           directory and tentative filename, but there is a chance that it
           might already exist.  Keep adding a numeric suffix "-1", "-2", etc.
           to the filename until an unused path is found, and then return that
           path.

           The suffix is actually added before the first "." in the filename
           if there is one; for example:

               picture.gif       archive.tar.gz      readme
               picture-1.gif     archive-1.tar.gz    readme-1
               picture-2.gif     archive-2.tar.gz    readme-2
               ...               ...                 ...
               picture-10.gif
               ...

           This can be a costly operation, and risky if you don't want files
           renamed, so it is in your best interest to minimize situations
           where these kinds of collisions occur.  Unfortunately, if a
           multipart message gives all of its parts the same recommended
           filename, and you are placing them all in the same directory, this
           method might be unavoidable.
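           The suffixing rule can be sketched in plain Perl as follows.
           The function unused_path is a hypothetical stand-in written for
           this illustration, not the module's implementation, and the
           scratch directory exists only for the demonstration.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Try FILENAME as given, then insert "-1", "-2", ... before the first "."
# (or at the end of the name if it contains no ".").
sub unused_path {    # hypothetical stand-in for find_unused_path
    my ($dir, $filename) = @_;
    my ($stem, $rest) = $filename =~ /\A([^.]*)(.*)\z/s;
    my $candidate = $filename;
    my $n = 0;
    while (-e File::Spec->catfile($dir, $candidate)) {
        $n++;
        $candidate = "$stem-$n$rest";
    }
    return File::Spec->catfile($dir, $candidate);
}

my $dir = tempdir(CLEANUP => 1);
my @paths;
for my $i (1 .. 3) {
    my $path = unused_path($dir, 'archive.tar.gz');
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
    push @paths, $path;
    print "$path\n";
}
```

           Each existence test costs a filesystem lookup, and the test and
           the later open() are separate steps, which is part of why
           frequent collisions are worth avoiding.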

       ignore_filename [YESNO]
           Instance method.  Return true if we should always ignore
           recommended filenames in messages, choosing instead to always
           generate our own filenames.  With argument, sets this value.

           Note: subclasses of MIME::Parser::Filer which override
           output_path() might not honor this setting; note, however, that
           the built-in subclasses honor it.

       output_dir HEAD
           Instance method.  Return the output directory for the given header.
           The default method returns ".".

       output_filename HEAD
           Instance method, subclasses only.  A recommended filename was
           either not given, or it was judged to be evil.  Return a fake name,
           possibly using information in the message HEADer.  Note that this
           is just the filename, not the full path.

           Used by output_path().  If you're using the default
           "output_path()", you probably don't need to worry about avoiding
           collisions with existing files; we take care of that in
           find_unused_path().

       output_prefix [PREFIX]
           Instance method.  Get the short string that all filenames for
           extracted body-parts will begin with (assuming that there is no
           better "recommended filename").  The default is "msg".

           If PREFIX is not given, the current output prefix is returned.  If
           PREFIX is given, the output prefix is set to the new value, and the
           previous value is returned.

           Used by output_filename().

           Note: subclasses of MIME::Parser::Filer which override
           output_path() or output_filename() might not honor this setting;
           note, however, that the built-in subclasses honor it.

       output_type_ext
           Instance method.  Return a reference to the hash used by the
           default output_filename() for mapping from content-types to
           extensions when there is no default extension to use.

               $emap = $filer->output_type_ext;
               $emap->{'text/plain'} = '.txt';
               $emap->{'text/html'}  = '.html';
               $emap->{'text/*'}     = '.txt';
               $emap->{'*/*'}        = '.dat';

           Note: subclasses of MIME::Parser::Filer which override
           output_path() or output_filename() might not consult this hash;
           note, however, that the built-in subclasses consult it.
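           To show how the wildcard keys in such a hash might be consulted,
           here is a plain-Perl sketch.  The lookup order (exact type, then
           "major/*", then "*/*") is an assumption inferred from the keys
           shown above, and ext_for is a hypothetical helper, not a method
           of this class.

```perl
use strict;
use warnings;

my %emap = (
    'text/plain' => '.txt',
    'text/html'  => '.html',
    'text/*'     => '.txt',
    '*/*'        => '.dat',
);

# Most specific key wins: "type/subtype", then "type/*", then "*/*".
sub ext_for {    # hypothetical helper
    my ($emap, $mime_type) = @_;
    my $type = lc $mime_type;
    my ($major) = split m{/}, $type;
    for my $key ($type, "$major/*", '*/*') {
        return $emap->{$key} if exists $emap->{$key};
    }
    return '';    # no mapping at all
}

print "$_ => ", ext_for(\%emap, $_), "\n"
    for qw(text/html text/enriched image/png);
```

           With the map above, text/html resolves through its exact key,
           text/enriched through "text/*", and image/png through "*/*".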

       output_path HEAD
           Instance method, subclasses only.  Given a MIME head for a file to
           be extracted, come up with a good output pathname for the extracted
           file.  This is the only method you need to worry about if you are
           building a custom filer.

           The default implementation does a lot of work; subclass
           implementers really should try to just override its components
           instead of the whole thing.  It works basically as follows:

               $directory = $self->output_dir($head);

               $filename = $head->recommended_filename();
               if (!$filename or
                    $self->ignore_filename() or
                    $self->evil_filename($filename)) {
                   $filename = $self->output_filename($head);
               }

               return $self->find_unused_path($directory, $filename);

           Note: There are many, many, many ways you might want to control the
           naming of files, based on your application.  If you don't like the
           behavior of this function, you can easily define your own subclass
           of MIME::Parser::Filer and override it there.

           Note: Nickolay Saukh pointed out that, given the subjective nature
           of what is "evil", this function really shouldn't warn about an
           evil filename, but maybe just issue a debug message.  I considered
           that, but then I thought: if debugging were off, people wouldn't
           know why (or even if) a given filename had been ignored.  In mail
           robots that depend on externally-provided filenames, this could
           cause hard-to-diagnose problems.  So, the message is still a
           warning.

           Thanks to Laurent Amon for pointing out problems with the original
           implementation, and for making some good suggestions.  Thanks also
           to Achim Bohnet for pointing out that there should be a hookless,
           OO way of overriding the output path.

       purge
           Instance method, final.  Purge all files/directories created by the
           last parse.  This method simply goes through the purgeable list in
           reverse order (see "purgeable") and removes all existing
           files/directories in it.  You should not need to override this
           method.

       purgeable [FILE]
           Instance method, final.  Add FILE to the list of "purgeable"
           files/directories (those which will be removed if you do a
           "purge()").  You should not need to override this method.

           If FILE is not given, the "purgeable" list is returned.  This may
           be used for more-sophisticated purging.

           As a special case, invoking this method with a FILE that is an
           arrayref will replace the purgeable list with a copy of the array's
           contents, so [] may be used to clear the list.

           Note that the "purgeable" list is cleared when a parser begins a
           new parse; therefore, if you want to use purge() to do cleanup, you
           must do so before starting a new parse!

       MIME::Parser::FileInto

       This concrete subclass of MIME::Parser::Filer supports filing into a
       given directory.

       init DIRECTORY
           Instance method, initializer.  Set the directory where all files
           will go.

       MIME::Parser::FileUnder

       This concrete subclass of MIME::Parser::Filer supports filing under a
       given directory, using one subdirectory per message, but with all
       message parts in the same directory.

       init BASEDIR, OPTSHASH...
           Instance method, initializer.  Set the base directory which will
           contain the message directories.  If used, then each parse
           begins by creating a new subdirectory of BASEDIR where the actual
           parts of the message are placed.  OPTSHASH can contain the
           following:

           DirName
               Explicitly set the name of the subdirectory which is created.
               The default is to use the time, process id, and a sequence
               number, but you might want a predictable directory.

           Purge
               Automatically purge the contents of the directory (including
               all subdirectories) before each parse.  This is really only
               needed if using an explicit DirName, and is provided as a
               convenience only.  Currently we use the 1-arg form of
               File::Path::rmtree; you should familiarize yourself with the
               caveats therein.

           The filer's output_dir() will return the path to this
           message-specific directory until the next parse is begun, so you
           can do this:

               use File::Path;

               $parser->output_under("/tmp");
               $ent = eval { $parser->parse_open($msg); };   ### parse
               if (!$ent) {         ### parse failed
                   rmtree($parser->filer->output_dir);
                   die "parse failed: $@";
               }
               else {               ### parse succeeded
                   ### ...do stuff...
               }

AUTHOR

       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).

       All rights reserved.  This program is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.

VERSION

       $Revision: 1.6 $

perl v5.8.8                       2006-03-17            MIME::Parser::Filer(3)