MIME::Parser(3)       User Contributed Perl Documentation      MIME::Parser(3)

NAME

6       MIME::Parser - experimental class for parsing MIME streams
7

SYNOPSIS

9       Before reading further, you should see MIME::Tools to make sure that
10       you understand where this module fits into the grand scheme of things.
11       Go on, do it now.  I'll wait.
12
13       Ready?  Ok...
14
15       Basic usage examples
16
17           ### Create a new parser object:
18           my $parser = new MIME::Parser;
19
20           ### Tell it where to put things:
21           $parser->output_under("/tmp");
22
23           ### Parse an input filehandle:
24           $entity = $parser->parse(\*STDIN);
25
26           ### Congratulations: you now have a (possibly multipart) MIME entity!
27           $entity->dump_skeleton;          # for debugging
28
29       Examples of input
30
31           ### Parse from filehandles:
32           $entity = $parser->parse(\*STDIN);
           $entity = $parser->parse(IO::File->new("some command|"));
34
35           ### Parse from any object that supports getline() and read():
36           $entity = $parser->parse($myHandle);
37
38           ### Parse an in-core MIME message:
39           $entity = $parser->parse_data($message);
40
           ### Parse a MIME message in a file:
           $entity = $parser->parse_open("/some/file.msg");

           ### Parse a MIME message out of a pipeline:
           $entity = $parser->parse_open("gunzip - < file.msg.gz |");
46
47           ### Parse already-split input (as "deliver" would give it to you):
48           $entity = $parser->parse_two("msg.head", "msg.body");
49
50       Examples of output control
51
52           ### Keep parsed message bodies in core (default outputs to disk):
53           $parser->output_to_core(1);
54
55           ### Output each message body to a one-per-message directory:
56           $parser->output_under("/tmp");
57
58           ### Output each message body to the same directory:
59           $parser->output_dir("/tmp");
60
61           ### Change how nameless message-component files are named:
62           $parser->output_prefix("msg");
63
64       Examples of error recovery
65
66           ### Normal mechanism:
67           eval { $entity = $parser->parse(\*STDIN) };
68           if ($@) {
69               $results  = $parser->results;
70               $decapitated = $parser->last_head;  ### get last top-level head
71           }
72
73           ### Ultra-tolerant mechanism:
74           $parser->ignore_errors(1);
75           $entity = eval { $parser->parse(\*STDIN) };
           $error = ($@ || $parser->last_error);
77
78           ### Cleanup all files created by the parse:
79           eval { $entity = $parser->parse(\*STDIN) };
80           ...
81           $parser->filer->purge;
82
83       Examples of parser options
84
85           ### Automatically attempt to RFC-1522-decode the MIME headers?
86           $parser->decode_headers(1);             ### default is false
87
88           ### Parse contained "message/rfc822" objects as nested MIME streams?
89           $parser->extract_nested_messages(0);    ### default is true
90
91           ### Look for uuencode in "text" messages, and extract it?
92           $parser->extract_uuencode(1);           ### default is false
93
94           ### Should we forgive normally-fatal errors?
95           $parser->ignore_errors(0);              ### default is true
96
97       Miscellaneous examples
98
99           ### Convert a Mail::Internet object to a MIME::Entity:
100           @lines = (@{$mail->header}, "\n", @{$mail->body});
101           $entity = $parser->parse_data(\@lines);
102

DESCRIPTION

104       You can inherit from this class to create your own subclasses that
105       parse MIME streams into MIME::Entity objects.
106

PUBLIC INTERFACE

108       Construction
109
110       new ARGS...
111           Class method.  Create a new parser object.  Once you do this, you
112           can then set up various parameters before doing the actual parsing.
113           For example:
114
115               my $parser = new MIME::Parser;
116               $parser->output_dir("/tmp");
117               $parser->output_prefix("msg1");
118               my $entity = $parser->parse(\*STDIN);
119
120           Any arguments are passed into "init()".  Don't override this in
121           your subclasses; override init() instead.
122
123       init ARGS...
           Instance method.  Initialize a new MIME::Parser object.  This is
125           automatically sent to a new object; you may want to override it.
126           If you override this, be sure to invoke the inherited method.
127
128       init_parse
129           Instance method.  Invoked automatically whenever one of the top-
130           level parse() methods is called, to reset the parser to a "ready"
131           state.
132
133       Altering how messages are parsed
134
135       decode_headers [YESNO]
136           Instance method.  Controls whether the parser will attempt to
137           decode all the MIME headers (as per RFC-1522) the moment it sees
138           them.  This is not advisable for two very important reasons:
139
140           *   It screws up the extraction of information from MIME fields.
141               If you fully decode the headers into bytes, you can inadver‐
142               tently transform a parseable MIME header like this:
143
144                   Content-type: text/plain; filename="=?ISO-8859-1?Q?Hi=22Ho?="
145
146               into unparseable gobbledygook; in this case:
147
148                   Content-type: text/plain; filename="Hi"Ho"
149
150           *   It is information-lossy.  An encoded string which contains both
151               Latin-1 and Cyrillic characters will be turned into a binary
152               mishmosh which simply can't be rendered.
153
154           History.  This method was once the only out-of-the-box way to deal
155           with attachments whose filenames had non-ASCII characters.  How‐
156           ever, since MIME-tools 5.4xx this is no longer necessary.
157
158           Parameters.  If YESNO is true, decoding is done.  However, you will
159           get a warning unless you use one of the special "true" values:
160
161              "I_NEED_TO_FIX_THIS"
162                     Just shut up and do it.  Not recommended.
163                     Provided only for those who need to keep old scripts functioning.
164
165              "I_KNOW_WHAT_I_AM_DOING"
166                     Just shut up and do it.  Not recommended.
167                     Provided for those who REALLY know what they are doing.
168
169           If YESNO is false (the default), no attempt at decoding will be
170           done.  With no argument, just returns the current setting.  Remem‐
171           ber: you can always decode the headers after the parsing has com‐
172           pleted (see MIME::Head::decode()), or decode the words on demand
173           (see MIME::Words).
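
           As a minimal sketch of the "decode on demand" approach mentioned
           above: the message text and variable names below are hypothetical,
           and the parser is left at its default (decode_headers false), so
           the stored header keeps its encoded form and is decoded only where
           it is needed, using decode_mimewords() from MIME::Words.

```perl
use strict;
use warnings;
use MIME::Parser;
use MIME::Words qw(decode_mimewords);

# Hypothetical sample message with an RFC-1522 encoded Subject.
my $message = join "",
    "Subject: =?ISO-8859-1?Q?Caf=E9?= menu\n",
    "Content-Type: text/plain\n",
    "\n",
    "Body text.\n";

my $parser = MIME::Parser->new;
$parser->output_to_core(1);      # keep bodies in memory for this demo
# decode_headers() is deliberately left at its default (false).

my $entity = $parser->parse_data($message);

# The parsed header still carries the encoded word...
my $raw = $entity->head->get('Subject');
chomp $raw;

# ...and is decoded only at the point of use (scalar context
# returns the decoded pieces joined into one string).
my $decoded = decode_mimewords($raw);
```

           Keeping the raw form in the header means that MIME field parsing
           is never confused by characters that only appear after decoding.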
174
175       extract_nested_messages OPTION
176           Instance method.  Some MIME messages will contain a part of type
           "message/rfc822", "message/partial" or "message/external-body":
178           literally, the text of an embedded mail/news/whatever message.
179           This option controls whether (and how) we parse that embedded mes‐
180           sage.
181
182           If the OPTION is false, we treat such a message just as if it were
183           a "text/plain" document, without attempting to decode its contents.
184
185           If the OPTION is true (the default), the body of the "mes‐
186           sage/rfc822" or "message/partial" part is parsed by this parser,
187           creating an entity object.  What happens then is determined by the
188           actual OPTION:
189
190           NEST or 1
191               The default setting.  The contained message becomes the sole
192               "part" of the "message/rfc822" entity (as if the containing
193               message were a special kind of "multipart" message).  You can
194               recover the sub-entity by invoking the parts() method on the
195               "message/rfc822" entity.
196
197           REPLACE
198               The contained message replaces the "message/rfc822" entity, as
199               though the "message/rfc822" "container" never existed.
200
201               Warning: notice that, with this option, all the header informa‐
202               tion in the "message/rfc822" header is lost.  This might seri‐
203               ously bother you if you're dealing with a top-level message,
204               and you've just lost the sender's address and the subject line.
205               ":-/".
206
207           Thanks to Andreas Koenig for suggesting this method.
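
           The NEST behaviour can be sketched as follows.  The sample message
           and addresses are hypothetical; the example only assumes the
           documented default, in which the contained message becomes the
           sole part of the "message/rfc822" entity.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical message whose body is itself a complete message.
my $message = <<'EOM';
From: outer@example.com
Subject: Forwarded
MIME-Version: 1.0
Content-Type: message/rfc822

From: inner@example.com
Subject: Original
Content-Type: text/plain

Hello from the inner message.
EOM

my $parser = MIME::Parser->new;
$parser->output_to_core(1);              # keep bodies in memory for this demo
$parser->extract_nested_messages(1);     # NEST, which is also the default

my $entity = $parser->parse_data($message);

# The container keeps its own header; the embedded message is its only part.
my @parts = $entity->parts;
my $inner_subject = $parts[0]->head->get('Subject');
chomp $inner_subject;
```

           With REPLACE instead of NEST, the outer "Forwarded" header would
           be discarded, which is the loss the warning above describes.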
208
209       extract_uuencode [YESNO]
210           Instance method.  If set true, then whenever we are confronted with
211           a message whose effective content-type is "text/plain" and whose
212           encoding is 7bit/8bit/binary, we scan the encoded body to see if it
213           contains uuencoded data (generally given away by a "begin XXX"
214           line).
215
216           If it does, we explode the uuencoded message into a multipart,
217           where the text before the first "begin XXX" becomes the first part,
218           and all "begin...end" sections following become the subsequent
219           parts.  The filename (if given) is accessible through the normal
220           means.
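
           A rough sketch of this option in use, under the assumption that
           the uuencoded block is introduced by a conventional
           "begin MODE FILENAME" line.  The message, file name and payload
           are hypothetical and are built with Perl's own pack("u", ...).

```perl
use strict;
use warnings;
use MIME::Parser;

# Build a hypothetical text/plain message carrying one uuencoded file.
my $uu = "begin 644 hello.txt\n"
       . pack("u", "Hello, world\n")     # one uuencoded data line
       . "`\nend\n";                     # zero-length line, then "end"

my $message = join "",
    "Subject: file for you\n",
    "Content-Type: text/plain\n",
    "\n",
    "Here is the file.\n",
    "\n",
    $uu;

my $parser = MIME::Parser->new;
$parser->output_to_core(1);              # keep bodies in memory for this demo
$parser->extract_uuencode(1);            # default is false

my $entity = $parser->parse_data($message);

# If uuencoded data was found, the entity has been turned into a multipart.
my @parts = $entity->parts;
printf "multipart: %s, parts: %d\n",
    ($entity->is_multipart ? "yes" : "no"), scalar @parts;
```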
221
222       ignore_errors [YESNO]
223           Instance method.  Controls whether the parser will attempt to
224           ignore normally-fatal errors, treating them as warnings and contin‐
225           uing with the parse.
226
227           If YESNO is true (the default), many syntax errors are tolerated.
228           If YESNO is false, fatal errors throw exceptions.  With no argu‐
229           ment, just returns the current setting.
230
231       decode_bodies [YESNO]
232           Instance method.  Controls whether the parser should decode entity
233           bodies or not.  If this is set to a false value (default is true),
234           all entity bodies will be kept as-is in the original content-trans‐
235           fer encoding.
236
           To prevent double encoding on the output side,
           MIME::Body->is_encoded is set, which tells MIME::Body not to encode
           the data again if encoded data is requested.  This is particularly
           useful when the content must not be modified, e.g. if you want to
           calculate OpenPGP signatures from it.

           WARNING: the semantics change significantly if you parse MIME
           messages with this option set, because MIME::Entity and MIME::Body
           then *always* see encoded data, whereas the default behaviour is to
           work with *decoded* data (and to encode it only if you request it).
           If you want decoded data, you must decode it yourself.

           So use this option only if you know exactly what you are doing and
           are sure that you really need it.
252
253       Parsing an input source
254
255       parse_data DATA
256           Instance method.  Parse a MIME message that's already in core.  You
257           may supply the DATA in any of a number of ways...
258
259           *   A scalar which holds the message.
260
261           *   A ref to a scalar which holds the message.  This is an effi‐
262               ciency hack.
263
264           *   A ref to an array of scalars.  They are treated as a stream
265               which (conceptually) consists of simply concatenating the
266               scalars.
267
268           Returns the parsed MIME::Entity on success.
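
           The three accepted forms of DATA can be sketched side by side.
           The message text is a hypothetical sample; each call is expected
           to produce an equivalent entity.

```perl
use strict;
use warnings;
use MIME::Parser;

# The same hypothetical message, as one string and as a list of lines.
my $text  = "Subject: demo\nContent-Type: text/plain\n\nHello.\n";
my @lines = ("Subject: demo\n", "Content-Type: text/plain\n", "\n", "Hello.\n");

my $parser = MIME::Parser->new;
$parser->output_to_core(1);                 # keep bodies in memory for this demo

my $from_scalar = $parser->parse_data($text);     # a scalar
my $from_ref    = $parser->parse_data(\$text);    # a ref to a scalar
my $from_array  = $parser->parse_data(\@lines);   # a ref to an array of scalars

# Collect the decoded body of each entity for comparison.
my @bodies = map { $_->bodyhandle->as_string }
             ($from_scalar, $from_ref, $from_array);
```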
269
270       parse INSTREAM
271           Instance method.  Takes a MIME-stream and splits it into its compo‐
272           nent entities.
273
274           The INSTREAM can be given as a readable FileHandle, an IO::File, a
275           globref filehandle (like "\*STDIN"), or as any blessed object con‐
276           forming to the IO:: interface (which minimally implements getline()
277           and read()).
278
279           Returns the parsed MIME::Entity on success.  Throws exception on
280           failure.  If the message contained too many parts (as set by
281           max_parts), returns undef.
282
283       parse_open EXPR
284           Instance method.  Convenience front-end onto "parse()".  Simply
285           give this method any expression that may be sent as the second
286           argument to open() to open a filehandle for reading.
287
288           Returns the parsed MIME::Entity on success.  Throws exception on
289           failure.
290
291       parse_two HEADFILE, BODYFILE
292           Instance method.  Convenience front-end onto "parse_open()",
293           intended for programs running under mail-handlers like deliver,
294           which splits the incoming mail message into a header file and a
295           body file.  Simply give this method the paths to the respective
296           files.
297
298           Warning: it is assumed that, once the files are cat'ed together,
299           there will be a blank line separating the head part and the body
300           part.
301
302           Warning: new implementation slurps files into line array for porta‐
303           bility, instead of using 'cat'.  May be an issue if your messages
304           are large.
305
306           Returns the parsed MIME::Entity on success.  Throws exception on
307           failure.
308
309       Specifying output destination
310
311       Warning: in 5.212 and before, this was done by methods of MIME::Parser.
312       However, since many users have requested fine-tuned control over how
313       this is done, the logic has been split off from the parser into its own
       class, MIME::Parser::Filer.  Every MIME::Parser maintains an instance
       of a MIME::Parser::Filer subclass to manage disk output (see
       MIME::Parser::Filer for details).
317
318       The benefit to this is that the MIME::Parser code won't be confounded
319       with a lot of garbage related to disk output.  The drawback is that the
320       way you override the default behavior will change.
321
322       For now, all the normal public-interface methods are still provided,
323       but many are only stubs which create or delegate to the underlying
324       MIME::Parser::Filer object.
325
326       filer [FILER]
327           Instance method.  Get/set the FILER object used to manage the out‐
328           put of files to disk.  This will be some subclass of
329           MIME::Parser::Filer.
330
331       output_dir DIRECTORY
332           Instance method.  Causes messages to be filed directly into the
333           given DIRECTORY.  It does this by setting the underlying filer() to
334           a new instance of MIME::Parser::FileInto, and passing the arguments
335           into that class' new() method.
336
           Note: Since this method replaces the underlying filer, you must
           invoke it before changing any attributes of the filer, like the
           output prefix; otherwise those changes will be lost.
340
341       output_under BASEDIR, OPTS...
342           Instance method.  Causes messages to be filed directly into subdi‐
343           rectories of the given BASEDIR, one subdirectory per message.  It
344           does this by setting the underlying filer() to a new instance of
345           MIME::Parser::FileUnder, and passing the arguments into that class'
346           new() method.
347
           Note: Since this method replaces the underlying filer, you must
           invoke it before changing any attributes of the filer, like the
           output prefix; otherwise those changes will be lost.
351
352       output_path HEAD
353           Instance method, DEPRECATED.  Given a MIME head for a file to be
354           extracted, come up with a good output pathname for the extracted
355           file.  Identical to the preferred form:
356
357                $parser->filer->output_path(...args...);
358
359           We just delegate this to the underlying filer() object.
360
361       output_prefix [PREFIX]
362           Instance method, DEPRECATED.  Get/set the short string that all
363           filenames for extracted body-parts will begin with (assuming that
364           there is no better "recommended filename").  Identical to the pre‐
365           ferred form:
366
367                $parser->filer->output_prefix(...args...);
368
369           We just delegate this to the underlying filer() object.
370
371       evil_filename NAME
372           Instance method, DEPRECATED.  Identical to the preferred form:
373
374                $parser->filer->evil_filename(...args...);
375
376           We just delegate this to the underlying filer() object.
377
378       max_parts NUM
379           Instance method.  Limits the number of MIME parts we will parse.
380
381           Normally, instances of this class parse a message to the bitter
382           end.  Messages with many MIME parts can cause excessive memory con‐
383           sumption.  If you invoke this method, parsing will abort with a
384           die() if a message contains more than NUM parts.
385
386           If NUM is set to -1 (the default), then no maximum limit is
387           enforced.
388
           With no argument, returns the current setting as an integer.
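
           A defensive sketch of using this limit.  This page describes the
           over-limit case both as a die() and, under parse(), as an undef
           return, so the example wraps the parse in eval and treats an
           undefined result as a rejection either way.  The message and the
           limit of 1 are hypothetical.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical multipart message with two parts.
my $message = <<'EOM';
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain

one
--XYZ
Content-Type: text/plain

two
--XYZ--
EOM

my $parser = MIME::Parser->new;
$parser->output_to_core(1);      # keep bodies in memory for this demo
$parser->max_parts(1);           # deliberately too small for this message

# Guard against both outcomes: an exception or an undef return value.
my $entity   = eval { $parser->parse_data($message) };
my $rejected = !defined $entity;

print "message rejected: too many parts or parse failure\n" if $rejected;
```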
390
391       output_to_core YESNO
392           Instance method.  Normally, instances of this class output all
393           their decoded body data to disk files (via MIME::Body::File).  How‐
394           ever, you can change this behaviour by invoking this method before
395           parsing:
396
397           If YESNO is false (the default), then all body data goes to disk
398           files.
399
           If YESNO is true, then all body data goes to in-core data
           structures.  This is a little risky (what if someone emails you an
           MPEG or a tar file, hmmm?) but people seem to want this bit of
           noose-shaped rope, so I'm providing it.  Note that setting this
           attribute true does not mean that parser-internal temporary files
           are avoided!  Use tmp_to_core() for that.
406
407           With no argument, returns the current setting as a boolean.
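
           The difference between the two settings can be sketched like
           this.  The message is a hypothetical sample and the scratch
           directory comes from File::Temp, which is an assumption of this
           example rather than anything the parser requires.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;

my $message = "Subject: demo\nContent-Type: text/plain\n\nHello.\n";

# In core: the body lives in memory and has no file path.
my $core_parser = MIME::Parser->new;
$core_parser->output_to_core(1);
my $in_core   = $core_parser->parse_data($message);
my $core_path = $in_core->bodyhandle->path;       # undef for in-core bodies

# On disk (the default): the body is written into a scratch directory.
my $dir = tempdir(CLEANUP => 1);                  # hypothetical location
my $disk_parser = MIME::Parser->new;
$disk_parser->output_dir($dir);
my $on_disk   = $disk_parser->parse_data($message);
my $disk_path = $on_disk->bodyhandle->path;
my $existed   = (-e $disk_path) ? 1 : 0;

# Remove the files created by the parse once they are no longer needed.
$disk_parser->filer->purge;
my $remains = (-e $disk_path) ? 1 : 0;
```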
408
409       tmp_recycling [YESNO]
410           Instance method.  Normally, tmpfiles are created when needed during
411           parsing, and destroyed automatically when they go out of scope.
412           But for efficiency, you might prefer for your parser to attempt to
413           rewind and reuse the same file until the parser itself is
414           destroyed.
415
416           If YESNO is true (the default), we allow recycling; tmpfiles per‐
417           sist until the parser itself is destroyed.  If YESNO is false, we
418           do not allow recycling; tmpfiles persist only as long as they are
419           needed during the parse.  With no argument, just returns the cur‐
420           rent setting.
421
422       tmp_to_core [YESNO]
423           Instance method.  Should new_tmpfile() create real temp files, or
424           use fake in-core ones?  Normally we allow the creation of temporary
425           disk files, since this allows us to handle huge attachments even
426           when core is limited.
427
428           If YESNO is true, we implement new_tmpfile() via in-core handles.
429           If YESNO is false (the default), we use real tmpfiles.  With no
430           argument, just returns the current setting.
431
432       use_inner_files [YESNO]
433           Instance method.  If you are parsing from a handle which supports
434           seek() and tell(), then we can avoid tmpfiles completely by using
435           IO::InnerFile, if so desired: basically, we simulate a temporary
436           file via pointers to virtual start- and end-positions in the input
437           stream.
438
439           If YESNO is false (the default), then we will not use IO::Inner‐
440           File.  If YESNO is true, we use IO::InnerFile if we can.  With no
441           argument, just returns the current setting.
442
443           Note: inner files are slower than real tmpfiles, but possibly
444           faster than in-core tmpfiles... so your choice for this option will
445           probably depend on your choice for tmp_to_core() and the kind of
446           input streams you are parsing.
447
448       Specifying classes to be instantiated
449
450       interface ROLE,[VALUE]
451           Instance method.  During parsing, the parser normally creates
452           instances of certain classes, like MIME::Entity.  However, you may
453           want to create a parser subclass that uses your own experimental
454           head, entity, etc. classes (for example, your "head" class may pro‐
455           vide some additional MIME-field-oriented methods).
456
457           If so, then this is the method that your subclass should invoke
458           during init.  Use it like this:
459
460               package MyParser;
461               @ISA = qw(MIME::Parser);
462               ...
463               sub init {
464                   my $self = shift;
465                   $self->SUPER::init(@_);        ### do my parent's init
466                   $self->interface(ENTITY_CLASS => 'MIME::MyEntity');
467                   $self->interface(HEAD_CLASS   => 'MIME::MyHead');
468                   $self;                         ### return
469               }
470
471           With no VALUE, returns the VALUE currently associated with that
472           ROLE.
473
474       new_body_for HEAD
475           Instance method.  Based on the HEAD of a part we are parsing,
476           return a new body object (any desirable subclass of MIME::Body) for
477           receiving that part's data.
478
479           If you set the "output_to_core" option to false before parsing (the
480           default), then we call "output_path()" and create a new
481           MIME::Body::File on that filename.
482
483           If you set the "output_to_core" option to true before parsing, then
484           you get a MIME::Body::InCore instead.
485
486           If you want the parser to do something else entirely, you can over‐
487           ride this method in a subclass.
488
489       new_tmpfile [RECYCLE]
490           Instance method.  Return an IO handle to be used to hold temporary
491           data during a parse.  The default uses the standard
492           IO::File->new_tmpfile() method unless tmp_to_core() dictates other‐
493           wise, but you can override this.  You shouldn't need to.
494
495           If you do override this, make certain that the object you return is
496           set for binmode(), and is able to handle the following methods:
497
498               read(BUF, NBYTES)
499               getline()
500               getlines()
501               print(@ARGS)
502               flush()
503               seek(0, 0)
504
505           Fatal exception if the stream could not be established.
506
507           If RECYCLE is given, it is an object returned by a previous invoca‐
508           tion of this method; to recycle it, this method must effectively
509           rewind and truncate it, and return the same object.  If you don't
510           want to support recycling, just ignore it and always return a new
511           object.
512
513       Parse results and error recovery
514
515       last_error
516           Instance method.  Return the error (if any) that we ignored in the
517           last parse.
518
519       last_head
520           Instance method.  Return the top-level MIME header of the last
521           stream we attempted to parse.  This is useful for replying to peo‐
522           ple who sent us bad MIME messages.
523
524               ### Parse an input stream:
525               eval { $entity = $parser->parse(\*STDIN) };
526               if (!$entity) {    ### parse failed!
527                   my $decapitated = $parser->last_head;
528                   ...
529               }
530
531       results
532           Instance method.  Return an object containing lots of info from the
533           last entity parsed.  This will be an instance of class
534           MIME::Parser::Results.
535

OPTIMIZING YOUR PARSER

537       Maximizing speed
538
539       Optimum input mechanisms:
540
541           parse()                    YES (if you give it a globref or a
542                                           subclass of IO::File)
543           parse_open()               YES
544           parse_data()               NO  (see below)
545           parse_two()                NO  (see below)
546
547       Optimum settings:
548
549           decode_headers()           *** (no real difference; 0 is slightly faster)
550           extract_nested_messages()  0   (may be slightly faster, but in
551                                           general you want it set to 1)
552           output_to_core()           0   (will be MUCH faster)
553           tmp_recycling()            1?  (probably, but should be investigated)
554           tmp_to_core()              0   (will be MUCH faster)
555           use_inner_files()          0   (if tmp_to_core() is 0;
556                                           use 1 otherwise)
557
558       File I/O is much faster than in-core I/O.  Although it seems like
559       slurping a message into core and processing it in-core should be
560       faster... it isn't.  Reason: Perl's filehandle-based I/O translates
561       directly into native operating-system calls, whereas the in-core I/O is
562       implemented in Perl.
563
564       Inner files are slower than real tmpfiles, but faster than in-core
565       ones.  If speed is your concern, that's why you should set
566       use_inner_files(true) if you set tmp_to_core(true): so that we can
567       bypass the slow in-core tmpfiles if the input stream permits.
568
569       Native I/O is much faster than object-oriented I/O.  It's much faster
       to use <$foo> than $foo->getline.  For backwards compatibility, this
571       module must continue to use object-oriented I/O in most places, but if
572       you use parse() with a "real" filehandle (string, globref, or subclass
573       of IO::File) then MIME::Parser is able to perform some crucial opti‐
574       mizations.
575
576       The parse_two() call is very inefficient.  Currently this is just a
577       front-end onto parse_data().  If your OS supports it, you're far better
578       off doing something like:
579
           $parser->parse_open("/bin/cat msg.head msg.body |");
581
582       Minimizing memory
583
584       Optimum input mechanisms:
585
586           parse()                    YES
587           parse_open()               YES
588           parse_data()               NO  (in-core I/O will burn core)
589           parse_two()                NO  (in-core I/O will burn core)
590
591       Optimum settings:
592
593           decode_headers()           *** (no real difference)
594           extract_nested_messages()  *** (no real difference)
595           output_to_core()           0   (will use MUCH less memory)
596           tmp_recycling()            0?  (promotes faster GC if
597                                           tmp_to_core is 1)
598           tmp_to_core()              0   (will use MUCH less memory)
599           use_inner_files()          *** (no real difference, but set it to 1
600                                           if you *must* have tmp_to_core set to 1,
601                                           so that you avoid in-core tmpfiles)
602
603       Maximizing tolerance of bad MIME
604
605       Optimum input mechanisms:
606
607           parse()                    *** (doesn't matter)
608           parse_open()               *** (doesn't matter)
609           parse_data()               *** (doesn't matter)
610           parse_two()                *** (doesn't matter)
611
612       Optimum settings:
613
614           decode_headers()           0   (sidesteps problem of bad hdr encodings)
615           extract_nested_messages()  0   (sidesteps problems of bad nested messages,
616                                           but often you want it set to 1 anyway).
617           output_to_core()           *** (doesn't matter)
618           tmp_recycling()            *** (doesn't matter)
619           tmp_to_core()              *** (doesn't matter)
620           use_inner_files()          *** (doesn't matter)
621
622       Avoiding disk-based temporary files
623
624       Optimum input mechanisms:
625
626           parse()                    YES (if you give it a seekable handle)
627           parse_open()               YES (becomes a seekable handle)
628           parse_data()               NO  (unless you set tmp_to_core(1))
629           parse_two()                NO  (unless you set tmp_to_core(1))
630
631       Optimum settings:
632
633           decode_headers()           *** (doesn't matter)
634           extract_nested_messages()  *** (doesn't matter)
635           output_to_core()           *** (doesn't matter)
636           tmp_recycling              1   (restricts created files to 1 per parser)
637           tmp_to_core()              1
638           use_inner_files()          1
639
640       If we can use them, inner files avoid most tmpfiles.  If you parse from
641       a seekable-and-tellable filehandle, then the internal
642       process_to_bound() doesn't need to extract each part into a temporary
643       buffer; it can use IO::InnerFile (warning: this will slow down the
644       parsing of messages with large attachments).
645
646       You can veto tmpfiles entirely.  If you might not be parsing from a
647       seekable-and-tellable filehandle, you can set tmp_to_core() true: this
648       will always use in-core I/O for the buffering (warning: this will slow
649       down the parsing of messages with large attachments).
650
651       Final resort.  You can always override new_tmpfile() in a subclass.
652

WARNINGS

654       Multipart messages are always read line-by-line
655           Multipart document parts are read line-by-line, so that the encap‐
656           sulation boundaries may easily be detected.  However, bad MIME com‐
657           position agents (for example, naive CGI scripts) might return mul‐
658           tipart documents where the parts are, say, unencoded bitmap
659           files... and, consequently, where such "lines" might be
660           veeeeeeeeery long indeed.
661
662           A better solution for this case would be to set up some form of
663           state machine for input processing.  This will be left for future
664           versions.
665
666       Multipart parts read into temp files before decoding
667           In my original implementation, the MIME::Decoder classes had to be
668           aware of encapsulation boundaries in multipart MIME documents.
669           While this decode-while-parsing approach obviated the need for tem‐
670           porary files, it resulted in inflexible and complex decoder imple‐
671           mentations.
672
673           The revised implementation uses a temporary file (a la "tmpfile()")
674           during parsing to hold the encoded portion of the current MIME doc‐
675           ument or part.  This file is deleted automatically after the cur‐
676           rent part is decoded and the data is written to the "body stream"
677           object; you'll never see it, and should never need to worry about
678           it.
679
680           Some folks have asked for the ability to bypass this temp-file
681           mechanism, I suppose because they assume it would slow down their
682           application.  I considered accommodating this wish, but the
683           temp-file approach solves a lot of thorny problems in parsing, and
684           it also protects against hidden bugs in user applications (what if
685           you've directed the encoded part into a scalar, and someone
686           unexpectedly sends you a 6 MB tar file?).  Finally, I'm just not
687           convinced that the temp-file use adds significant overhead.
688
689       Fuzzing of CRLF and newline on input
690           RFC-1521 dictates that MIME streams have lines terminated by CRLF
691           ("\r\n").  However, it is extremely likely that folks will want to
692           parse MIME streams where each line ends in the local newline char‐
693           acter "\n" instead.
694
695           An attempt has been made to allow the parser to handle both CRLF
696           and newline-terminated input.
697
698       Fuzzing of CRLF and newline on output
699           The "7bit" and "8bit" decoders will decode both a "\n" and a "\r\n"
700           end-of-line sequence into a "\n".
701
702           The "binary" decoder (default if no encoding specified) still out‐
703           puts stuff verbatim... so a MIME message with CRLFs and no explicit
704           encoding will be output as a text file that, on many systems, will
705           have an annoying ^M at the end of each line... but this is as it
706           should be.
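
              The difference between the two behaviours can be sketched in
              plain Perl.  The two functions below are hypothetical
              stand-ins written for this illustration; they are not the
              MIME::Decoder classes, only a model of what is described
              above.

```perl
use strict;
use warnings;

# Hypothetical model of the "7bit"/"8bit" behaviour: either end-of-line
# sequence at the end of a line comes out as a plain "\n".
sub decode_text_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z/\n/;
    return $line;
}

# Hypothetical model of the "binary" behaviour: bytes pass through
# untouched, so any CR before the newline survives.
sub decode_binary_chunk {
    my ($chunk) = @_;
    return $chunk;
}

my $crlf_line = "Hello, world\r\n";     # invented sample input
my $as_text   = decode_text_line($crlf_line);
my $as_binary = decode_binary_chunk($crlf_line);

for my $pair ([ text => $as_text ], [ binary => $as_binary ]) {
    my ($name, $data) = @$pair;
    printf "%-6s %2d bytes, CR kept: %s\n",
        $name, length $data, ($data =~ /\r/ ? 'yes' : 'no');
}
```

              The CR that survives in the "binary" case is the ^M you would
              see at the end of each line in many editors.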
707
708       Inability to handle multipart boundaries that contain newlines
709           First, let's get something straight: this is an evil, EVIL prac‐
710           tice, and is incompatible with RFC-1521... hence, it's not valid
711           MIME.
712
713           If your mailer creates multipart boundary strings that contain new‐
714           lines when they appear in the message body, give it two weeks
715           notice and find another one.  If your mail robot receives MIME mail
716           like this, regard it as syntactically incorrect MIME, which it is.
717
718           Why do I say that?  Well, in RFC-1521, the syntax of a boundary is
719           given quite clearly:
720
721                 boundary := 0*69<bchars> bcharsnospace
722
723                 bchars := bcharsnospace / " "
724
725                 bcharsnospace :=    DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
726                              / "," / "-" / "." / "/" / ":" / "=" / "?"
727
728           All of which means that a valid boundary string cannot have new‐
729           lines in it, and any newlines in such a string in the message
730           header are expected to be solely the result of folding the string
731           (i.e., inserting to-be-removed newlines for readability and line-
732           shortening only).
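
              As a rough check, the grammar above can be transcribed into a
              Perl regular expression.  This is a sketch, not something
              MIME::Parser provides: is_valid_boundary() and
              unfold_header_value() are hypothetical names, and the
              unfolding shown simply drops a line break that is followed by
              a space or tab.

```perl
use strict;
use warnings;

# bcharsnospace from the grammar above, as a character class.
my $BCHARSNOSPACE = qr{[0-9A-Za-z'()+_,\-./:=?]};

# boundary := 0*69<bchars> bcharsnospace
# That is 1 to 70 characters, where only the last one may not be a space.
my $BOUNDARY = qr{\A(?:$BCHARSNOSPACE|[ ]){0,69}$BCHARSNOSPACE\z};

# Hypothetical helper: true if the string matches the boundary grammar.
sub is_valid_boundary {
    my ($boundary) = @_;
    return $boundary =~ $BOUNDARY ? 1 : 0;
}

# Hypothetical helper: undo header folding by removing a line break
# that is immediately followed by a space or tab.
sub unfold_header_value {
    my ($value) = @_;
    $value =~ s/\r?\n(?=[ \t])//g;
    return $value;
}

# Invented test strings.
my $folded = "ABC-\n 123";
for my $candidate ("simple-boundary_42", "trailing space ", $folded,
                   unfold_header_value($folded)) {
    (my $shown = $candidate) =~ s/\n/\\n/g;     # make newlines visible
    printf "%-22s %s\n", "\"$shown\"",
        is_valid_boundary($candidate) ? "valid" : "not valid";
}
```

              A string with a raw newline fails the check, while the same
              string after unfolding passes, which is the distinction the
              paragraph above draws.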
733
734           Yet, there is at least one brain-damaged user agent out there that
735           composes mail like this:
736
737                 MIME-Version: 1.0
738                 Content-type: multipart/mixed; boundary="----ABC-
739                  123----"
740                 Subject: Hi... I'm a dork!
741
742                 This is a multipart MIME message (yeah, right...)
743
744                 ----ABC-
745                  123----
746
747                 Hi there!
748
749           We have got to discourage practices like this (and the recent file
750           upload idiocy where binary files that are part of a multipart MIME
751           message aren't base64-encoded) if we want MIME to stay relatively
752           simple, and MIME parsers to be relatively robust.
753
754           Thanks to Andreas Koenig for bringing a baaaaaaaaad user agent to
755           my attention.
756

AUTHOR

758       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
759       David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com
760
761       All rights reserved.  This program is free software; you can redis‐
762       tribute it and/or modify it under the same terms as Perl itself.
763

VERSION

765       $Revision: 1.20 $ $Date: 2006/03/17 21:03:23 $
766
767
768
769perl v5.8.8                       2006-03-17                   MIME::Parser(3)