MIME::Parser(3)       User Contributed Perl Documentation      MIME::Parser(3)

NAME

       MIME::Parser - experimental class for parsing MIME streams

SYNOPSIS

       Before reading further, you should see MIME::Tools to make sure that
       you understand where this module fits into the grand scheme of things.
       Go on, do it now.  I'll wait.

       Ready?  Ok...

   Basic usage examples
           ### Create a new parser object:
           my $parser = new MIME::Parser;

           ### Tell it where to put things:
           $parser->output_under("/tmp");

           ### Parse an input filehandle:
           $entity = $parser->parse(\*STDIN);

           ### Congratulations: you now have a (possibly multipart) MIME entity!
           $entity->dump_skeleton;          # for debugging

   Examples of input
           ### Parse from filehandles:
           $entity = $parser->parse(\*STDIN);
           $entity = $parser->parse(IO::File->new("some command|"));

           ### Parse from any object that supports getline() and read():
           $entity = $parser->parse($myHandle);

           ### Parse an in-core MIME message:
           $entity = $parser->parse_data($message);

           ### Parse a MIME message in a file:
           $entity = $parser->parse_open("/some/file.msg");

           ### Parse a MIME message out of a pipeline:
           $entity = $parser->parse_open("gunzip - < file.msg.gz |");

           ### Parse already-split input (as "deliver" would give it to you):
           $entity = $parser->parse_two("msg.head", "msg.body");

   Examples of output control
           ### Keep parsed message bodies in core (default outputs to disk):
           $parser->output_to_core(1);

           ### Output each message body to a one-per-message directory:
           $parser->output_under("/tmp");

           ### Output each message body to the same directory:
           $parser->output_dir("/tmp");

           ### Change how nameless message-component files are named:
           $parser->output_prefix("msg");

           ### Put temporary files somewhere else:
           $parser->tmp_dir("/var/tmp/mytmpdir");

   Examples of error recovery
           ### Normal mechanism:
           eval { $entity = $parser->parse(\*STDIN) };
           if ($@) {
               $results  = $parser->results;
               $decapitated = $parser->last_head;  ### get last top-level head
           }

           ### Ultra-tolerant mechanism:
           $parser->ignore_errors(1);
           $entity = eval { $parser->parse(\*STDIN) };
           $error = ($@ || $parser->last_error);

           ### Cleanup all files created by the parse:
           eval { $entity = $parser->parse(\*STDIN) };
           ...
           $parser->filer->purge;

   Examples of parser options
           ### Automatically attempt to RFC 2047-decode the MIME headers?
           $parser->decode_headers(1);             ### default is false

           ### Parse contained "message/rfc822" objects as nested MIME streams?
           $parser->extract_nested_messages(0);    ### default is true

           ### Look for uuencode in "text" messages, and extract it?
           $parser->extract_uuencode(1);           ### default is false

           ### Should we forgive normally-fatal errors?
           $parser->ignore_errors(0);              ### default is true

   Miscellaneous examples
           ### Convert a Mail::Internet object to a MIME::Entity:
           @lines = (@{$mail->header}, "\n", @{$mail->body});
           $entity = $parser->parse_data(\@lines);

DESCRIPTION

       You can inherit from this class to create your own subclasses that
       parse MIME streams into MIME::Entity objects.

PUBLIC INTERFACE

   Construction
       new ARGS...
           Class method.  Create a new parser object.  Once you do this, you
           can then set up various parameters before doing the actual parsing.
           For example:

               my $parser = new MIME::Parser;
               $parser->output_dir("/tmp");
               $parser->output_prefix("msg1");
               my $entity = $parser->parse(\*STDIN);

           Any arguments are passed into "init()".  Don't override this in
           your subclasses; override init() instead.

       init ARGS...
           Instance method.  Initialize a new MIME::Parser object.  This is
           automatically sent to a new object; you may want to override it.
           If you override this, be sure to invoke the inherited method.

       init_parse
           Instance method.  Invoked automatically whenever one of the top-
           level parse() methods is called, to reset the parser to a "ready"
           state.

   Altering how messages are parsed
       decode_headers [YESNO]
           Instance method.  Controls whether the parser will attempt to
           decode all the MIME headers (as per RFC 2047) the moment it sees
           them.  This is not advisable for two very important reasons:

           ·   It screws up the extraction of information from MIME fields.
               If you fully decode the headers into bytes, you can
               inadvertently transform a parseable MIME header like this:

                   Content-type: text/plain; filename="=?ISO-8859-1?Q?Hi=22Ho?="

               into unparseable gobbledygook; in this case:

                   Content-type: text/plain; filename="Hi"Ho"

           ·   It is information-lossy.  An encoded string which contains both
               Latin-1 and Cyrillic characters will be turned into a binary
               mishmosh which simply can't be rendered.

           History.  This method was once the only out-of-the-box way to deal
           with attachments whose filenames had non-ASCII characters.
           However, since MIME-tools 5.4xx this is no longer necessary.

           Parameters.  If YESNO is true, decoding is done.  However, you will
           get a warning unless you use one of the special "true" values:

              "I_NEED_TO_FIX_THIS"
                     Just shut up and do it.  Not recommended.
                     Provided only for those who need to keep old scripts functioning.

              "I_KNOW_WHAT_I_AM_DOING"
                     Just shut up and do it.  Not recommended.
                     Provided for those who REALLY know what they are doing.

           If YESNO is false (the default), no attempt at decoding will be
           done.  With no argument, just returns the current setting.
           Remember: you can always decode the headers after the parsing has
           completed (see MIME::Head::decode()), or decode the words on demand
           (see MIME::Words).
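
           A minimal sketch of the decode-on-demand approach follows.  The
           message text and variable names are invented for illustration; the
           parser leaves the header encoded, and MIME::Words decodes it later:

```perl
use strict;
use warnings;
use MIME::Parser;
use MIME::Words qw(decode_mimewords);

# Hypothetical message with an RFC 2047 encoded Subject.
my $message = "Subject: =?ISO-8859-1?Q?Caf=E9?=\n\nJust a body.\n";

my $parser = MIME::Parser->new;
$parser->output_to_core(1);      # keep the body in memory for this demo
# decode_headers() is deliberately left at its default (false).

my $entity = $parser->parse_data($message);

# The parsed header still holds the encoded word...
my $raw = $entity->head->get('Subject');
chomp $raw;

# ...and can be decoded when needed.  In scalar context decode_mimewords()
# returns the decoded data joined into one string of bytes.
my $decoded = decode_mimewords($raw);

print  "raw:     $raw\n";
printf "decoded: %d bytes\n", length $decoded;
```

           Because the stored header is untouched, structured fields such as
           Content-type remain parseable.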

       extract_nested_messages OPTION
           Instance method.  Some MIME messages will contain a part of type
           "message/rfc822", "message/partial" or "message/external-body":
           literally, the text of an embedded mail/news/whatever message.
           This option controls whether (and how) we parse that embedded
           message.

           If the OPTION is false, we treat such a message just as if it were
           a "text/plain" document, without attempting to decode its contents.

           If the OPTION is true (the default), the body of the
           "message/rfc822" or "message/partial" part is parsed by this
           parser, creating an entity object.  What happens then is determined
           by the actual OPTION:

           NEST or 1
               The default setting.  The contained message becomes the sole
               "part" of the "message/rfc822" entity (as if the containing
               message were a special kind of "multipart" message).  You can
               recover the sub-entity by invoking the parts() method on the
               "message/rfc822" entity.

           REPLACE
               The contained message replaces the "message/rfc822" entity, as
               though the "message/rfc822" "container" never existed.

               Warning: notice that, with this option, all the header
               information in the "message/rfc822" header is lost.  This might
               seriously bother you if you're dealing with a top-level
               message, and you've just lost the sender's address and the
               subject line.  ":-/".

           Thanks to Andreas Koenig for suggesting this method.
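
           The three settings can be compared side by side.  This is only a
           sketch: the message text is invented, and bodies are kept in core
           so that the example leaves no files behind:

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical message whose whole body is an embedded message.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Subject: outer',
    'Content-Type: message/rfc822',
    '',
    'Subject: inner',
    'Content-Type: text/plain',
    '',
    'Inner body.',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);

# NEST (the default): the inner message is the sole part of the outer one.
$parser->extract_nested_messages('NEST');
my $nested  = $parser->parse_data($message);
my ($inner) = $nested->parts;
print "NEST:    outer subject is ", $nested->head->get('Subject');
print "NEST:    inner subject is ", $inner->head->get('Subject');

# REPLACE: the container vanishes, and its header goes with it.
$parser->extract_nested_messages('REPLACE');
my $replaced = $parser->parse_data($message);
print "REPLACE: top subject is   ", $replaced->head->get('Subject');

# False: the embedded message is left unparsed in the body.
$parser->extract_nested_messages(0);
my $flat = $parser->parse_data($message);
print "OFF:     number of parts  ", scalar($flat->parts), "\n";
```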

       extract_uuencode [YESNO]
           Instance method.  If set true, then whenever we are confronted with
           a message whose effective content-type is "text/plain" and whose
           encoding is 7bit/8bit/binary, we scan the encoded body to see if it
           contains uuencoded data (generally given away by a "begin XXX"
           line).

           If it does, we explode the uuencoded message into a multipart,
           where the text before the first "begin XXX" becomes the first part,
           and all "begin...end" sections following become the subsequent
           parts.  The filename (if given) is accessible through the normal
           means.
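
           As a hedged illustration, the sketch below builds a small
           text/plain message with a uuencoded section pasted into its body
           (using Perl's pack('u', ...) to produce the data line) and parses
           it with extraction switched on.  The message, filename and payload
           are all made up for the example:

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical payload and message.
my $payload = "Hello, world\n";
my $message = join '',
    "Subject: uu demo\n",
    "Content-Type: text/plain\n",
    "\n",
    "Some introductory text.\n",
    "begin 644 hello.txt\n",
    pack('u', $payload),       # one uuencoded data line
    "`\n",                     # zero-length line that ends the data
    "end\n";

my $parser = MIME::Parser->new;
$parser->output_to_core(1);
$parser->extract_uuencode(1);          # off by default

my $entity = $parser->parse_data($message);

# If the uuencoded section was recognised, the entity is now a multipart
# and the last part holds the decoded payload.
my @parts = $entity->parts;
printf "%d part(s), top-level type %s\n", scalar(@parts), $entity->mime_type;
print $parts[-1]->bodyhandle->as_string if @parts;
```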

       ignore_errors [YESNO]
           Instance method.  Controls whether the parser will attempt to
           ignore normally-fatal errors, treating them as warnings and
           continuing with the parse.

           If YESNO is true (the default), many syntax errors are tolerated.
           If YESNO is false, fatal errors throw exceptions.  With no
           argument, just returns the current setting.

       decode_bodies [YESNO]
           Instance method.  Controls whether the parser should decode entity
           bodies or not.  If this is set to a false value (default is true),
           all entity bodies will be kept as-is in the original content-
           transfer encoding.

           To prevent double encoding on the output side,
           MIME::Body->is_encoded is set, which tells MIME::Body not to encode
           the data again if encoded data is requested.  This is
           particularly useful when the content must not be modified, e.g.
           if you want to calculate OpenPGP signatures from it.

           WARNING: the semantics change significantly if you parse MIME
           messages with this option set, because MIME::Entity and
           MIME::Body *always* see encoded data now, while the default
           behaviour is to work with *decoded* data (and to encode it only
           if you request it).  You need to decode the data yourself if you
           want to have it decoded.

           So use this option only if you know exactly what you're doing and
           you're sure that you really need it.

   Parsing an input source
       parse_data DATA
           Instance method.  Parse a MIME message that's already in core.  You
           may supply the DATA in any of a number of ways...

           ·   A scalar which holds the message.

           ·   A ref to a scalar which holds the message.  This is an
               efficiency hack.

           ·   A ref to an array of scalars.  They are treated as a stream
               which (conceptually) consists of simply concatenating the
               scalars.

           Returns the parsed MIME::Entity on success.
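
           The three forms of DATA can be sketched as follows.  The two-part
           message is invented for the example, and bodies are kept in core
           so that nothing is written to disk:

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical two-part message.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Subject: parse_data demo',
    'Content-Type: multipart/mixed; boundary="XYZ"',
    '',
    'Preamble text, ignored by MIME readers.',
    '--XYZ',
    'Content-Type: text/plain',
    '',
    'Hello from part one.',
    '--XYZ',
    'Content-Type: text/plain',
    '',
    'Hello from part two.',
    '--XYZ--',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);

# One array element per line, with the newlines kept.
my @lines = split /^/, $message;

my %entity = (
    scalar    => $parser->parse_data($message),
    scalarref => $parser->parse_data(\$message),
    arrayref  => $parser->parse_data(\@lines),
);

for my $form (sort keys %entity) {
    printf "%-9s -> %d parts\n", $form, scalar($entity{$form}->parts);
}
```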

       parse INSTREAM
           Instance method.  Takes a MIME-stream and splits it into its
           component entities.

           The INSTREAM can be given as an IO::File, a globref filehandle
           (like "\*STDIN"), or as any blessed object conforming to the IO::
           interface (which minimally implements getline() and read()).

           Returns the parsed MIME::Entity on success.  Throws exception on
           failure.  If the message contained too many parts (as set by
           max_parts), returns undef.

       parse_open EXPR
           Instance method.  Convenience front-end onto "parse()".  Simply
           give this method any expression that may be sent as the second
           argument to open() to open a filehandle for reading.

           Returns the parsed MIME::Entity on success.  Throws exception on
           failure.

       parse_two HEADFILE, BODYFILE
           Instance method.  Convenience front-end onto "parse_open()",
           intended for programs running under mail-handlers like deliver,
           which splits the incoming mail message into a header file and a
           body file.  Simply give this method the paths to the respective
           files.

           Warning: it is assumed that, once the files are cat'ed together,
           there will be a blank line separating the head part and the body
           part.

           Warning: new implementation slurps files into line array for
           portability, instead of using 'cat'.  May be an issue if your
           messages are large.

           Returns the parsed MIME::Entity on success.  Throws exception on
           failure.

   Specifying output destination
       Warning: in 5.212 and before, this was done by methods of MIME::Parser.
       However, since many users have requested fine-tuned control over how
       this is done, the logic has been split off from the parser into its own
       class, MIME::Parser::Filer.  Every MIME::Parser maintains an instance
       of a MIME::Parser::Filer subclass to manage disk output (see
       MIME::Parser::Filer for details).

       The benefit to this is that the MIME::Parser code won't be confounded
       with a lot of garbage related to disk output.  The drawback is that the
       way you override the default behavior will change.

       For now, all the normal public-interface methods are still provided,
       but many are only stubs which create or delegate to the underlying
       MIME::Parser::Filer object.

       filer [FILER]
           Instance method.  Get/set the FILER object used to manage the
           output of files to disk.  This will be some subclass of
           MIME::Parser::Filer.

       output_dir DIRECTORY
           Instance method.  Causes messages to be filed directly into the
           given DIRECTORY.  It does this by setting the underlying filer() to
           a new instance of MIME::Parser::FileInto, and passing the arguments
           into that class' new() method.

           Note: Since this method replaces the underlying filer, you must
           invoke it before changing any attributes of the filer, like
           the output prefix; otherwise those changes will be lost.

       output_under BASEDIR, OPTS...
           Instance method.  Causes messages to be filed directly into
           subdirectories of the given BASEDIR, one subdirectory per message.
           It does this by setting the underlying filer() to a new instance of
           MIME::Parser::FileUnder, and passing the arguments into that class'
           new() method.

           Note: Since this method replaces the underlying filer, you must
           invoke it before changing any attributes of the filer, like
           the output prefix; otherwise those changes will be lost.
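
           The ordering rule can be illustrated with a short sketch.  The
           scratch directory, message text and the "demo" prefix are
           assumptions made for the example; the filer is replaced first and
           only then adjusted:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;

# Hypothetical single-part message.
my $message = "Subject: filing demo\nContent-Type: text/plain\n\nSave me.\n";

# A scratch directory that is removed when the program exits.
my $base = tempdir(CLEANUP => 1);

my $parser = MIME::Parser->new;
$parser->output_under($base);            # replaces the filer, so call it first
$parser->filer->output_prefix("demo");   # then adjust the new filer

my $entity = $parser->parse_data($message);

# The decoded body now lives in a file in a per-message subdirectory.
my $path    = $entity->bodyhandle->path;
my $existed = -e $path ? 1 : 0;
print "body written to $path\n";

# Remove everything the last parse created.
$parser->filer->purge;
my $gone = -e $path ? 0 : 1;
print "after purge: ", ($gone ? "removed" : "still there"), "\n";
```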

       output_path HEAD
           Instance method, DEPRECATED.  Given a MIME head for a file to be
           extracted, come up with a good output pathname for the extracted
           file.  Identical to the preferred form:

                $parser->filer->output_path(...args...);

           We just delegate this to the underlying filer() object.

       output_prefix [PREFIX]
           Instance method, DEPRECATED.  Get/set the short string that all
           filenames for extracted body-parts will begin with (assuming that
           there is no better "recommended filename").  Identical to the
           preferred form:

                $parser->filer->output_prefix(...args...);

           We just delegate this to the underlying filer() object.

       evil_filename NAME
           Instance method, DEPRECATED.  Identical to the preferred form:

                $parser->filer->evil_filename(...args...);

           We just delegate this to the underlying filer() object.

       max_parts NUM
           Instance method.  Limits the number of MIME parts we will parse.

           Normally, instances of this class parse a message to the bitter
           end.  Messages with many MIME parts can cause excessive memory
           consumption.  If you invoke this method, parsing will abort with a
           die() if a message contains more than NUM parts.

           If NUM is set to -1 (the default), then no maximum limit is
           enforced.

           With no argument, returns the current setting as an integer.
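
           A defensive calling pattern might look like the sketch below.  The
           message is invented, and the eval is there so that the caller ends
           up with an undefined entity whether the parser dies or returns
           undef when the limit is exceeded:

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical multipart message with three leaf parts.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Content-Type: multipart/mixed; boundary="B"',
    '',
    (map { ('--B', 'Content-Type: text/plain', '', "Part $_") } 1 .. 3),
    '--B--',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);

# A limit that is deliberately lower than this message needs.
$parser->max_parts(2);
my $limited = eval { $parser->parse_data($message) };
print defined $limited ? "parsed\n" : "rejected: too many parts\n";

# Back to the default: no limit.
$parser->max_parts(-1);
my $full = $parser->parse_data($message);
printf "unlimited parse found %d parts\n", scalar($full->parts);
```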

       output_to_core YESNO
           Instance method.  Normally, instances of this class output all
           their decoded body data to disk files (via MIME::Body::File).
           However, you can change this behaviour by invoking this method
           before parsing:

           If YESNO is false (the default), then all body data goes to disk
           files.

           If YESNO is true, then all body data goes to in-core data
           structures.  This is a little risky (what if someone emails you an
           MPEG or a tar file, hmmm?) but people seem to want this bit of
           noose-shaped rope, so I'm providing it.  Note that setting this
           attribute true does not mean that parser-internal temporary files
           are avoided!  Use tmp_to_core() for that.

           With no argument, returns the current setting as a boolean.

       tmp_recycling
           Instance method, DEPRECATED.

           This method is a no-op to preserve the pre-5.421 API.

           The tmp_recycling() feature was removed in 5.421 because it had
           never actually worked.  Please update your code to stop using it.

       tmp_to_core [YESNO]
           Instance method.  Should new_tmpfile() create real temp files, or
           use fake in-core ones?  Normally we allow the creation of temporary
           disk files, since this allows us to handle huge attachments even
           when core is limited.

           If YESNO is true, we implement new_tmpfile() via in-core handles.
           If YESNO is false (the default), we use real tmpfiles.  With no
           argument, just returns the current setting.

       use_inner_files [YESNO]
           Instance method.  If you are parsing from a handle which supports
           seek() and tell(), then we can avoid tmpfiles completely by using
           IO::InnerFile, if so desired: basically, we simulate a temporary
           file via pointers to virtual start- and end-positions in the input
           stream.

           If YESNO is false (the default), then we will not use
           IO::InnerFile.  If YESNO is true, we use IO::InnerFile if we can.
           With no argument, just returns the current setting.

           Note: inner files are slower than real tmpfiles, but possibly
           faster than in-core tmpfiles... so your choice for this option will
           probably depend on your choice for tmp_to_core() and the kind of
           input streams you are parsing.

   Specifying classes to be instantiated
       interface ROLE,[VALUE]
           Instance method.  During parsing, the parser normally creates
           instances of certain classes, like MIME::Entity.  However, you may
           want to create a parser subclass that uses your own experimental
           head, entity, etc. classes (for example, your "head" class may
           provide some additional MIME-field-oriented methods).

           If so, then this is the method that your subclass should invoke
           during init.  Use it like this:

               package MyParser;
               @ISA = qw(MIME::Parser);
               ...
               sub init {
                   my $self = shift;
                   $self->SUPER::init(@_);        ### do my parent's init
                   $self->interface(ENTITY_CLASS => 'MIME::MyEntity');
                   $self->interface(HEAD_CLASS   => 'MIME::MyHead');
                   $self;                         ### return
               }

           With no VALUE, returns the VALUE currently associated with that
           ROLE.

       new_body_for HEAD
           Instance method.  Based on the HEAD of a part we are parsing,
           return a new body object (any desirable subclass of MIME::Body) for
           receiving that part's data.

           If you set the "output_to_core" option to false before parsing (the
           default), then we call "output_path()" and create a new
           MIME::Body::File on that filename.

           If you set the "output_to_core" option to true before parsing, then
           you get a MIME::Body::InCore instead.

           If you want the parser to do something else entirely, you can
           override this method in a subclass.
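
           The effect of "output_to_core" on the body class can be observed
           with a small sketch.  The message text and the scratch directory
           are assumptions made for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;

# Hypothetical single-part message.
my $message = "Subject: body demo\nContent-Type: text/plain\n\nSome text.\n";

my $parser = MIME::Parser->new;

# output_to_core true: the body is held in memory.
$parser->output_to_core(1);
my $in_core = $parser->parse_data($message)->bodyhandle;

# output_to_core false (the default): the body is written to a file.
$parser->output_to_core(0);
$parser->output_dir(tempdir(CLEANUP => 1));   # scratch directory for the demo
my $on_disk = $parser->parse_data($message)->bodyhandle;

print "in core: ", ref($in_core), "\n";
print "on disk: ", ref($on_disk), "\n";
```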

   Temporary File Creation
       tmp_dir DIRECTORY
           Instance method.  Causes any temporary files created by this parser
           to be created in the given DIRECTORY.

           If called without arguments, returns current value.

           The default value is undef, which will cause new_tmpfile() to use
           the system default temporary directory.

       new_tmpfile
           Instance method.  Return an IO handle to be used to hold temporary
           data during a parse.

           The default uses MIME::Tools::tmpopen() to create a new temporary
           file, unless tmp_to_core() dictates otherwise, but you can override
           this.  You shouldn't need to.

           The location for temporary files can be changed on a per-parser
           basis with tmp_dir().

           If you do override this, make certain that the object you return is
           set for binmode(), and is able to handle the following methods:

               read(BUF, NBYTES)
               getline()
               getlines()
               print(@ARGS)
               flush()
               seek(0, 0)

           Fatal exception if the stream could not be established.

   Parse results and error recovery
       last_error
           Instance method.  Return the error (if any) that we ignored in the
           last parse.

       last_head
           Instance method.  Return the top-level MIME header of the last
           stream we attempted to parse.  This is useful for replying to
           people who sent us bad MIME messages.

               ### Parse an input stream:
               eval { $entity = $parser->parse(\*STDIN) };
               if (!$entity) {    ### parse failed!
                   my $decapitated = $parser->last_head;
                   ...
               }

       results
           Instance method.  Return an object containing lots of info from the
           last entity parsed.  This will be an instance of class
           MIME::Parser::Results.
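
           These three methods can be combined after a tolerant parse, as in
           the sketch below.  The damaged message is invented, and whether
           anything is reported for it depends on the parser, so no output is
           shown:

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical damaged message: the closing "--B--" line is missing.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Subject: truncated',
    'Content-Type: multipart/mixed; boundary="B"',
    '',
    '--B',
    'Content-Type: text/plain',
    '',
    'Only part.',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);
$parser->ignore_errors(1);             # the default, stated for clarity

my $entity  = eval { $parser->parse_data($message) };
my $error   = $@ || $parser->last_error;
my $head    = $parser->last_head;      # top-level head of the last attempt
my $results = $parser->results;        # a MIME::Parser::Results object

print "problem: $error\n" if $error;
print "subject: ", $head->get('Subject') if $head;
print "log:     $_\n" for $results->msgs;
```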

OPTIMIZING YOUR PARSER

   Maximizing speed
       Optimum input mechanisms:

           parse()                    YES (if you give it a globref or a
                                           subclass of IO::File)
           parse_open()               YES
           parse_data()               NO  (see below)
           parse_two()                NO  (see below)

       Optimum settings:

           decode_headers()           *** (no real difference; 0 is slightly faster)
           extract_nested_messages()  0   (may be slightly faster, but in
                                           general you want it set to 1)
           output_to_core()           0   (will be MUCH faster)
           tmp_to_core()              0   (will be MUCH faster)
           use_inner_files()          0   (if tmp_to_core() is 0;
                                           use 1 otherwise)

       File I/O is much faster than in-core I/O.  Although it seems like
       slurping a message into core and processing it in-core should be
       faster... it isn't.  Reason: Perl's filehandle-based I/O translates
       directly into native operating-system calls, whereas the in-core I/O is
       implemented in Perl.

       Inner files are slower than real tmpfiles, but faster than in-core
       ones.  If speed is your concern, that's why you should set
       use_inner_files(true) if you set tmp_to_core(true): so that we can
       bypass the slow in-core tmpfiles if the input stream permits.

       Native I/O is much faster than object-oriented I/O.  It's much faster
       to use <$foo> than $foo->getline.  For backwards compatibility, this
       module must continue to use object-oriented I/O in most places, but if
       you use parse() with a "real" filehandle (string, globref, or subclass
       of IO::File) then MIME::Parser is able to perform some crucial
       optimizations.

       The parse_two() call is very inefficient.  Currently this is just a
       front-end onto parse_data().  If your OS supports it, you're far better
       off doing something like:

           $parser->parse_open("/bin/cat msg.head msg.body |");

   Minimizing memory
       Optimum input mechanisms:

           parse()                    YES
           parse_open()               YES
           parse_data()               NO  (in-core I/O will burn core)
           parse_two()                NO  (in-core I/O will burn core)

       Optimum settings:

           decode_headers()           *** (no real difference)
           extract_nested_messages()  *** (no real difference)
           output_to_core()           0   (will use MUCH less memory)
           tmp_to_core()              0   (will use MUCH less memory)
           use_inner_files()          *** (no real difference, but set it to 1
                                           if you *must* have tmp_to_core set to 1,
                                           so that you avoid in-core tmpfiles)

   Maximizing tolerance of bad MIME
       Optimum input mechanisms:

           parse()                    *** (doesn't matter)
           parse_open()               *** (doesn't matter)
           parse_data()               *** (doesn't matter)
           parse_two()                *** (doesn't matter)

       Optimum settings:

           decode_headers()           0   (sidesteps problem of bad hdr encodings)
           extract_nested_messages()  0   (sidesteps problems of bad nested messages,
                                           but often you want it set to 1 anyway).
           output_to_core()           *** (doesn't matter)
           tmp_to_core()              *** (doesn't matter)
           use_inner_files()          *** (doesn't matter)

   Avoiding disk-based temporary files
       Optimum input mechanisms:

           parse()                    YES (if you give it a seekable handle)
           parse_open()               YES (becomes a seekable handle)
           parse_data()               NO  (unless you set tmp_to_core(1))
           parse_two()                NO  (unless you set tmp_to_core(1))

       Optimum settings:

           decode_headers()           *** (doesn't matter)
           extract_nested_messages()  *** (doesn't matter)
           output_to_core()           *** (doesn't matter)
           tmp_to_core()              1
           use_inner_files()          1

       If we can use them, inner files avoid most tmpfiles.  If you parse from
       a seekable-and-tellable filehandle, then the internal
       process_to_bound() doesn't need to extract each part into a temporary
       buffer; it can use IO::InnerFile (warning: this will slow down the
       parsing of messages with large attachments).

       You can veto tmpfiles entirely.  If you might not be parsing from a
       seekable-and-tellable filehandle, you can set tmp_to_core() true: this
       will always use in-core I/O for the buffering (warning: this will slow
       down the parsing of messages with large attachments).

       Final resort.  You can always override new_tmpfile() in a subclass.

WARNINGS

       Multipart messages are always read line-by-line
           Multipart document parts are read line-by-line, so that the
           encapsulation boundaries may easily be detected.  However, bad MIME
           composition agents (for example, naive CGI scripts) might return
           multipart documents where the parts are, say, unencoded bitmap
           files... and, consequently, where such "lines" might be
           veeeeeeeeery long indeed.

           A better solution for this case would be to set up some form of
           state machine for input processing.  This will be left for future
           versions.

       Multipart parts read into temp files before decoding
           In my original implementation, the MIME::Decoder classes had to be
           aware of encapsulation boundaries in multipart MIME documents.
           While this decode-while-parsing approach obviated the need for
           temporary files, it resulted in inflexible and complex decoder
           implementations.

           The revised implementation uses a temporary file (a la "tmpfile()")
           during parsing to hold the encoded portion of the current MIME
           document or part.  This file is deleted automatically after the
           current part is decoded and the data is written to the "body
           stream" object; you'll never see it, and should never need to worry
           about it.

           Some folks have asked for the ability to bypass this temp-file
           mechanism, I suppose because they assume it would slow down their
           application.  I considered accommodating this wish, but the temp-
           file approach solves a lot of thorny problems in parsing, and it
           also protects against hidden bugs in user applications (what if
           you've directed the encoded part into a scalar, and someone
           unexpectedly sends you a 6 MB tar file?).  Finally, I'm just not
           convinced that the temp-file use adds significant overhead.

       Fuzzing of CRLF and newline on input
           RFC 2045 dictates that MIME streams have lines terminated by CRLF
           ("\r\n").  However, it is extremely likely that folks will want to
           parse MIME streams where each line ends in the local newline
           character "\n" instead.

           An attempt has been made to allow the parser to handle both CRLF
           and newline-terminated input.

       Fuzzing of CRLF and newline on output
           The "7bit" and "8bit" decoders will decode both a "\n" and a "\r\n"
           end-of-line sequence into a "\n".

           The "binary" decoder (default if no encoding specified) still
           outputs stuff verbatim... so a MIME message with CRLFs and no
           explicit encoding will be output as a text file that, on many
           systems, will have an annoying ^M at the end of each line... but
           this is as it should be.

       Inability to handle multipart boundaries that contain newlines
           First, let's get something straight: this is an evil, EVIL
           practice, and is incompatible with RFC 2046... hence, it's not
           valid MIME.

           If your mailer creates multipart boundary strings that contain
           newlines when they appear in the message body, give it two weeks'
           notice and find another one.  If your mail robot receives MIME mail
           like this, regard it as syntactically incorrect MIME, which it is.

           Why do I say that?  Well, in RFC 2046, the syntax of a boundary is
           given quite clearly:

                 boundary := 0*69<bchars> bcharsnospace

                 bchars := bcharsnospace / " "

                 bcharsnospace :=    DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
                              / "," / "-" / "." / "/" / ":" / "=" / "?"

           All of which means that a valid boundary string cannot have
           newlines in it, and any newlines in such a string in the message
           header are expected to be solely the result of folding the string
           (i.e., inserting to-be-removed newlines for readability and line-
           shortening only).
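
           The grammar translates directly into a regular expression.  The
           helper below is a hypothetical illustration and not part of
           MIME::Parser; it accepts 0 to 69 "bchars" followed by exactly one
           "bcharsnospace" character:

```perl
use strict;
use warnings;

# Character classes transcribed from the grammar quoted above.
my $bcharsnospace = qr{[0-9A-Za-z'()+_,\-./:=?]};
my $bchars        = qr{[0-9A-Za-z'()+_,\-./:=? ]};

# Hypothetical helper: returns 1 if $boundary matches
#   boundary := 0*69<bchars> bcharsnospace
# and 0 otherwise.
sub is_valid_boundary {
    my ($boundary) = @_;
    return $boundary =~ /\A(?:$bchars){0,69}$bcharsnospace\z/ ? 1 : 0;
}

my %candidate = (
    'plain'            => "----ABC-123----",
    'embedded newline' => "----ABC-\n 123----",
    'trailing space'   => "ABC 123 ",
    'too long'         => 'x' x 71,
);

for my $name (sort keys %candidate) {
    printf "%-17s %s\n", $name,
        is_valid_boundary($candidate{$name}) ? "valid" : "invalid";
}
```

           Only the first candidate passes: the others contain a newline, end
           in a space, or exceed 70 characters.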

           Yet, there is at least one brain-damaged user agent out there that
           composes mail like this:

                 MIME-Version: 1.0
                 Content-type: multipart/mixed; boundary="----ABC-
                  123----"
                 Subject: Hi... I'm a dork!

                 This is a multipart MIME message (yeah, right...)

                 ----ABC-
                  123----

                 Hi there!

           We have got to discourage practices like this (and the recent file
           upload idiocy where binary files that are part of a multipart MIME
           message aren't base64-encoded) if we want MIME to stay relatively
           simple, and MIME parsers to be relatively robust.

           Thanks to Andreas Koenig for bringing a baaaaaaaaad user agent to
           my attention.

SEE ALSO

       MIME::Tools, MIME::Head, MIME::Body, MIME::Entity, MIME::Decoder

AUTHOR

       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
       David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

       All rights reserved.  This program is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.

perl v5.10.1                      2008-06-30                   MIME::Parser(3)