MIME::Parser(3)       User Contributed Perl Documentation      MIME::Parser(3)

NAME
MIME::Parser - experimental class for parsing MIME streams

SYNOPSIS
Before reading further, you should see MIME::Tools to make sure that
you understand where this module fits into the grand scheme of things.
Go on, do it now. I'll wait.

Ready? Ok...

Basic usage examples
    ### Create a new parser object:
    my $parser = MIME::Parser->new;

    ### Tell it where to put things:
    $parser->output_under("/tmp");

    ### Parse an input filehandle:
    my $entity = $parser->parse(\*STDIN);

    ### Congratulations: you now have a (possibly multipart) MIME entity!
    $entity->dump_skeleton;          # for debugging

Examples of input
    ### Parse from filehandles:
    $entity = $parser->parse(\*STDIN);
    $entity = $parser->parse(IO::File->new("some command|"));

    ### Parse from any object that supports getline() and read():
    $entity = $parser->parse($myHandle);

    ### Parse an in-core MIME message:
    $entity = $parser->parse_data($message);

    ### Parse a MIME message in a file:
    $entity = $parser->parse_open("/some/file.msg");

    ### Parse a MIME message out of a pipeline:
    $entity = $parser->parse_open("gunzip - < file.msg.gz |");

    ### Parse already-split input (as "deliver" would give it to you):
    $entity = $parser->parse_two("msg.head", "msg.body");

Examples of output control
    ### Keep parsed message bodies in core (default outputs to disk):
    $parser->output_to_core(1);

    ### Output each message body to a one-per-message directory:
    $parser->output_under("/tmp");

    ### Output each message body to the same directory:
    $parser->output_dir("/tmp");

    ### Change how nameless message-component files are named:
    $parser->output_prefix("msg");

    ### Put temporary files somewhere else:
    $parser->tmp_dir("/var/tmp/mytmpdir");

Examples of error recovery
    ### Normal mechanism:
    eval { $entity = $parser->parse(\*STDIN) };
    if ($@) {
        $results     = $parser->results;
        $decapitated = $parser->last_head;  ### get last top-level head
    }

    ### Ultra-tolerant mechanism:
    $parser->ignore_errors(1);
    $entity = eval { $parser->parse(\*STDIN) };
    $error = ($@ || $parser->last_error);

    ### Clean up all files created by the parse:
    eval { $entity = $parser->parse(\*STDIN) };
    ...
    $parser->filer->purge;

Examples of parser options
    ### Automatically attempt to RFC 2047-decode the MIME headers?
    $parser->decode_headers(1);             ### default is false

    ### Parse contained "message/rfc822" objects as nested MIME streams?
    $parser->extract_nested_messages(0);    ### default is true

    ### Look for uuencode in "text" messages, and extract it?
    $parser->extract_uuencode(1);           ### default is false

    ### Should we forgive normally-fatal errors?
    $parser->ignore_errors(0);              ### default is true

Miscellaneous examples
    ### Convert a Mail::Internet object to a MIME::Entity:
    @lines = (@{$mail->header}, "\n", @{$mail->body});
    $entity = $parser->parse_data(\@lines);

DESCRIPTION
You can inherit from this class to create your own subclasses that
parse MIME streams into MIME::Entity objects.

PUBLIC INTERFACE
Construction
new ARGS...
    Class method. Create a new parser object. Once you do this, you
    can then set up various parameters before doing the actual parsing.
    For example:

        my $parser = MIME::Parser->new;
        $parser->output_dir("/tmp");
        $parser->output_prefix("msg1");
        my $entity = $parser->parse(\*STDIN);

    Any arguments are passed into "init()". Don't override this in
    your subclasses; override init() instead.

init ARGS...
    Instance method. Initialize a new MIME::Parser object. This is
    automatically sent to a new object; you may want to override it.
    If you override this, be sure to invoke the inherited method.

init_parse
    Instance method. Invoked automatically whenever one of the top-level
    parse() methods is called, to reset the parser to a "ready" state.

Altering how messages are parsed
decode_headers [YESNO]
    Instance method. Controls whether the parser will attempt to
    decode all the MIME headers (as per RFC 2047) the moment it sees
    them. This is not advisable for two very important reasons:

    ·   It screws up the extraction of information from MIME fields.
        If you fully decode the headers into bytes, you can
        inadvertently transform a parseable MIME header like this:

            Content-type: text/plain; filename="=?ISO-8859-1?Q?Hi=22Ho?="

        into unparseable gobbledygook; in this case:

            Content-type: text/plain; filename="Hi"Ho"

    ·   It is information-lossy. An encoded string which contains both
        Latin-1 and Cyrillic characters will be turned into a binary
        mishmosh which simply can't be rendered.

    History. This method was once the only out-of-the-box way to deal
    with attachments whose filenames had non-ASCII characters.
    However, since MIME-tools 5.4xx this is no longer necessary.

    Parameters. If YESNO is true, decoding is done. However, you will
    get a warning unless you use one of the special "true" values:

        "I_NEED_TO_FIX_THIS"
            Just shut up and do it. Not recommended.
            Provided only for those who need to keep old scripts functioning.

        "I_KNOW_WHAT_I_AM_DOING"
            Just shut up and do it. Not recommended.
            Provided for those who REALLY know what they are doing.

    If YESNO is false (the default), no attempt at decoding will be
    done. With no argument, just returns the current setting.
    Remember: you can always decode the headers after the parsing has
    completed (see MIME::Head::decode()), or decode the words on demand
    (see MIME::Words).

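A minimal sketch of the decode-on-demand approach, assuming MIME-tools is
installed. The header stays encoded during parsing, and one encoded word is
decoded later with decode_mimewords() from MIME::Words. The sample string is
the filename from the Content-type example above.

```perl
use strict;
use warnings;
use MIME::Words qw(decode_mimewords);

# The encoded filename from the Content-type example above.
my $encoded = '=?ISO-8859-1?Q?Hi=22Ho?=';

# In scalar context, decode_mimewords() joins the decoded pieces into
# one string; the charset information is discarded.
my $decoded = decode_mimewords($encoded);

print "$decoded\n";
```

Because the raw header is never rewritten, the field remains parseable, and
the embedded quote only appears in the value you extracted.
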
extract_nested_messages OPTION
    Instance method. Some MIME messages will contain a part of type
    "message/rfc822", "message/partial", or "message/external-body":
    literally, the text of an embedded mail/news/whatever message.
    This option controls whether (and how) we parse that embedded
    message.

    If the OPTION is false, we treat such a message just as if it were
    a "text/plain" document, without attempting to decode its contents.

    If the OPTION is true (the default), the body of the
    "message/rfc822" or "message/partial" part is parsed by this
    parser, creating an entity object. What happens then is determined
    by the actual OPTION:

    NEST or 1
        The default setting. The contained message becomes the sole
        "part" of the "message/rfc822" entity (as if the containing
        message were a special kind of "multipart" message). You can
        recover the sub-entity by invoking the parts() method on the
        "message/rfc822" entity.

    REPLACE
        The contained message replaces the "message/rfc822" entity, as
        though the "message/rfc822" "container" never existed.

        Warning: notice that, with this option, all the header
        information in the "message/rfc822" header is lost. This might
        seriously bother you if you're dealing with a top-level
        message, and you've just lost the sender's address and the
        subject line. ":-/".

    Thanks to Andreas Koenig for suggesting this method.

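The NEST behaviour can be sketched as follows, assuming MIME-tools is
installed. The message text is hand-written illustrative data, and bodies are
kept in core only to keep the demo free of disk files.

```perl
use strict;
use warnings;
use MIME::Parser;

# A tiny hand-written message/rfc822 wrapper (illustrative data only).
my $message = join "\n",
    'Subject: outer',
    'Content-Type: message/rfc822',
    '',
    'Subject: inner',
    'Content-Type: text/plain',
    '',
    'Hello from the nested message.',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);            # keep bodies in memory for the demo
$parser->extract_nested_messages(1);   # NEST, the default

my $entity  = $parser->parse_data($message);
my ($inner) = $entity->parts;          # the contained message
my $inner_subject = $inner->head->get('Subject');
chomp $inner_subject;

print "container type: ", $entity->mime_type, "\n";
print "inner subject:  $inner_subject\n";
```

With REPLACE instead, the returned entity would be the inner message itself,
and the "Subject: outer" header would no longer be reachable from it.
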
extract_uuencode [YESNO]
    Instance method. If set true, then whenever we are confronted with
    a message whose effective content-type is "text/plain" and whose
    encoding is 7bit/8bit/binary, we scan the encoded body to see if it
    contains uuencoded data (generally given away by a "begin XXX"
    line).

    If it does, we explode the uuencoded message into a multipart,
    where the text before the first "begin XXX" becomes the first part,
    and all "begin...end" sections following become the subsequent
    parts. The filename (if given) is accessible through the normal
    means.

ignore_errors [YESNO]
    Instance method. Controls whether the parser will attempt to
    ignore normally-fatal errors, treating them as warnings and
    continuing with the parse.

    If YESNO is true (the default), many syntax errors are tolerated.
    If YESNO is false, fatal errors throw exceptions. With no
    argument, just returns the current setting.

decode_bodies [YESNO]
    Instance method. Controls whether the parser should decode entity
    bodies or not. If this is set to a false value (default is true),
    all entity bodies will be kept as-is in the original
    content-transfer encoding.

    To prevent double encoding on the output side,
    MIME::Body->is_encoded is set, which tells MIME::Body not to encode
    the data again if encoded data was requested. This is particularly
    useful when the content must not be modified, e.g. if you want to
    calculate OpenPGP signatures from it.

    WARNING: the semantics change significantly if you parse MIME
    messages with this option set, because MIME::Entity and MIME::Body
    *always* see encoded data now, while the default behaviour is to
    work with *decoded* data (and to encode it only if you request it).
    If you want decoded data, you need to decode it yourself.

    So use this option only if you know exactly what you're doing and
    are sure that you really need it.

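A short sketch of the difference, assuming MIME-tools is installed. The
helper body_with() is hypothetical and exists only for this example; it
parses the same base64 message with decode_bodies() on or off and returns
the stored body.

```perl
use strict;
use warnings;
use MIME::Parser;

my $message = join "\n",
    'Content-Type: text/plain',
    'Content-Transfer-Encoding: base64',
    '',
    'SGVsbG8=',        # base64 for "Hello"
    '';

# Helper for this sketch only: parse in core and return the body string.
sub body_with {
    my ($decode) = @_;
    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);
    $parser->decode_bodies($decode);
    return $parser->parse_data($message)->bodyhandle->as_string;
}

my $decoded = body_with(1);   # default behaviour: decoded text
my $raw     = body_with(0);   # body left in its transfer encoding

print "decoded: $decoded\n";
print "raw:     $raw";
```
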
Parsing an input source
parse_data DATA
    Instance method. Parse a MIME message that's already in core. You
    may supply the DATA in any of a number of ways...

    ·   A scalar which holds the message.

    ·   A ref to a scalar which holds the message. This is an
        efficiency hack.

    ·   A ref to an array of scalars. They are treated as a stream
        which (conceptually) consists of simply concatenating the
        scalars.

    Returns the parsed MIME::Entity on success.

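The three accepted forms of DATA can be exercised side by side. This is a
minimal sketch assuming MIME-tools is installed; the message text is made up
for the demo.

```perl
use strict;
use warnings;
use MIME::Parser;

my $text  = "Subject: demo\n\nbody line\n";
my @lines = ("Subject: demo\n", "\n", "body line\n");

my $parser = MIME::Parser->new;
$parser->output_to_core(1);   # avoid writing body files in this demo

# The same message, supplied in each of the three accepted forms:
# a scalar, a ref to a scalar, and a ref to an array of scalars.
my @subjects;
for my $data ($text, \$text, \@lines) {
    my $entity  = $parser->parse_data($data);
    my $subject = $entity->head->get('Subject');
    chomp $subject;
    push @subjects, $subject;
}

print join(", ", @subjects), "\n";
```
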
parse INSTREAM
    Instance method. Takes a MIME-stream and splits it into its
    component entities.

    The INSTREAM can be given as an IO::File, a globref filehandle
    (like "\*STDIN"), or as any blessed object conforming to the IO::
    interface (which minimally implements getline() and read()).

    Returns the parsed MIME::Entity on success. Throws an exception on
    failure. If the message contained too many parts (as set by
    max_parts), returns undef.

parse_open EXPR
    Instance method. Convenience front-end onto "parse()". Simply
    give this method any expression that may be sent as the second
    argument to open() to open a filehandle for reading.

    Returns the parsed MIME::Entity on success. Throws an exception on
    failure.

parse_two HEADFILE, BODYFILE
    Instance method. Convenience front-end onto "parse_open()",
    intended for programs running under mail-handlers like deliver,
    which splits the incoming mail message into a header file and a
    body file. Simply give this method the paths to the respective
    files.

    Warning: it is assumed that, once the files are cat'ed together,
    there will be a blank line separating the head part and the body
    part.

    Warning: the new implementation slurps the files into a line array
    for portability, instead of using 'cat'. This may be an issue if
    your messages are large.

    Returns the parsed MIME::Entity on success. Throws an exception on
    failure.

Specifying output destination
Warning: in 5.212 and before, this was done by methods of MIME::Parser.
However, since many users have requested fine-tuned control over how
this is done, the logic has been split off from the parser into its own
class, MIME::Parser::Filer. Every MIME::Parser maintains an instance of
a MIME::Parser::Filer subclass to manage disk output (see
MIME::Parser::Filer for details).

The benefit to this is that the MIME::Parser code won't be confounded
with a lot of garbage related to disk output. The drawback is that the
way you override the default behavior will change.

For now, all the normal public-interface methods are still provided,
but many are only stubs which create or delegate to the underlying
MIME::Parser::Filer object.

filer [FILER]
    Instance method. Get/set the FILER object used to manage the
    output of files to disk. This will be some subclass of
    MIME::Parser::Filer.

output_dir DIRECTORY
    Instance method. Causes messages to be filed directly into the
    given DIRECTORY. It does this by setting the underlying filer() to
    a new instance of MIME::Parser::FileInto, and passing the arguments
    into that class' new() method.

    Note: Since this method replaces the underlying filer, you must
    invoke it before changing any attributes of the filer, like the
    output prefix; otherwise those changes will be lost.

output_under BASEDIR, OPTS...
    Instance method. Causes messages to be filed directly into
    subdirectories of the given BASEDIR, one subdirectory per message.
    It does this by setting the underlying filer() to a new instance of
    MIME::Parser::FileUnder, and passing the arguments into that class'
    new() method.

    Note: Since this method replaces the underlying filer, you must
    invoke it before changing any attributes of the filer, like the
    output prefix; otherwise those changes will be lost.

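The ordering rule in the notes above can be sketched like this, assuming
MIME-tools is installed. The scratch directory comes from File::Temp and the
prefix "part" is an arbitrary choice for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;

my $dir = tempdir(CLEANUP => 1);         # scratch directory for the demo

my $parser = MIME::Parser->new;
$parser->output_dir($dir);               # installs a fresh filer first...
$parser->filer->output_prefix('part');   # ...then adjust that filer

my $entity = $parser->parse_data("Subject: demo\n\nbody line\n");
my $path   = $entity->bodyhandle->path;  # where the body was written

print "body written to $path\n";
$parser->filer->purge;                   # remove files created by the parse
```

Calling output_prefix() before output_dir() would configure a filer that is
then thrown away, which is the loss the note warns about.
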
output_path HEAD
    Instance method, DEPRECATED. Given a MIME head for a file to be
    extracted, come up with a good output pathname for the extracted
    file. Identical to the preferred form:

        $parser->filer->output_path(...args...);

    We just delegate this to the underlying filer() object.

output_prefix [PREFIX]
    Instance method, DEPRECATED. Get/set the short string that all
    filenames for extracted body-parts will begin with (assuming that
    there is no better "recommended filename"). Identical to the
    preferred form:

        $parser->filer->output_prefix(...args...);

    We just delegate this to the underlying filer() object.

evil_filename NAME
    Instance method, DEPRECATED. Identical to the preferred form:

        $parser->filer->evil_filename(...args...);

    We just delegate this to the underlying filer() object.

max_parts NUM
    Instance method. Limits the number of MIME parts we will parse.

    Normally, instances of this class parse a message to the bitter
    end. Messages with many MIME parts can cause excessive memory
    consumption. If you invoke this method, parsing will abort with a
    die() if a message contains more than NUM parts.

    If NUM is set to -1 (the default), then no maximum limit is
    enforced.

    With no argument, returns the current setting as an integer.

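A defensive way to use the limit, sketched under the assumption that
MIME-tools is installed. This page describes both a die() and an undef
return for over-limit messages, so the sketch guards against either. The
message and the limit of 2 are illustrative.

```perl
use strict;
use warnings;
use MIME::Parser;

# A hand-written multipart message with three small parts.
my $message = join "\n",
    'Content-Type: multipart/mixed; boundary="XX"',
    '',
    '--XX', 'Content-Type: text/plain', '', 'one',
    '--XX', 'Content-Type: text/plain', '', 'two',
    '--XX', 'Content-Type: text/plain', '', 'three',
    '--XX--',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);
$parser->max_parts(2);        # fewer parts than the message contains

# Guard against both documented outcomes: an exception or undef.
my $entity   = eval { $parser->parse_data($message) };
my $rejected = !defined $entity;

print $rejected ? "message rejected\n" : "message accepted\n";
```
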
output_to_core YESNO
    Instance method. Normally, instances of this class output all
    their decoded body data to disk files (via MIME::Body::File).
    However, you can change this behaviour by invoking this method
    before parsing:

    If YESNO is false (the default), then all body data goes to disk
    files.

    If YESNO is true, then all body data goes to in-core data
    structures. This is a little risky (what if someone emails you an
    MPEG or a tar file, hmmm?) but people seem to want this bit of
    noose-shaped rope, so I'm providing it. Note that setting this
    attribute true does not mean that parser-internal temporary files
    are avoided! Use tmp_to_core() for that.

    With no argument, returns the current setting as a boolean.

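A minimal sketch of in-core output, assuming MIME-tools is installed. The
one-line message is made up for the demo.

```perl
use strict;
use warnings;
use MIME::Parser;

my $parser = MIME::Parser->new;
$parser->output_to_core(1);            # must be set before parsing

my $entity = $parser->parse_data("Subject: demo\n\nHello, world!\n");
my $body   = $entity->bodyhandle;      # a MIME::Body::InCore here
my $text   = $body->as_string;         # the decoded body as one string

print ref($body), ": $text";
```

Since the whole body lives in memory, this only suits input whose size you
control, for the reasons given above.
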
tmp_recycling
    Instance method, DEPRECATED.

    This method is a no-op to preserve the pre-5.421 API.

    The tmp_recycling() feature was removed in 5.421 because it had
    never actually worked. Please update your code to stop using it.

tmp_to_core [YESNO]
    Instance method. Should new_tmpfile() create real temp files, or
    use fake in-core ones? Normally we allow the creation of temporary
    disk files, since this allows us to handle huge attachments even
    when core is limited.

    If YESNO is true, we implement new_tmpfile() via in-core handles.
    If YESNO is false (the default), we use real tmpfiles. With no
    argument, just returns the current setting.

use_inner_files [YESNO]
    Instance method. If you are parsing from a handle which supports
    seek() and tell(), then we can avoid tmpfiles completely by using
    IO::InnerFile, if so desired: basically, we simulate a temporary
    file via pointers to virtual start- and end-positions in the input
    stream.

    If YESNO is false (the default), then we will not use
    IO::InnerFile. If YESNO is true, we use IO::InnerFile if we can.
    With no argument, just returns the current setting.

    Note: inner files are slower than real tmpfiles, but possibly
    faster than in-core tmpfiles... so your choice for this option will
    probably depend on your choice for tmp_to_core() and the kind of
    input streams you are parsing.

Specifying classes to be instantiated
interface ROLE,[VALUE]
    Instance method. During parsing, the parser normally creates
    instances of certain classes, like MIME::Entity. However, you may
    want to create a parser subclass that uses your own experimental
    head, entity, etc. classes (for example, your "head" class may
    provide some additional MIME-field-oriented methods).

    If so, then this is the method that your subclass should invoke
    during init. Use it like this:

        package MyParser;
        @ISA = qw(MIME::Parser);
        ...
        sub init {
            my $self = shift;
            $self->SUPER::init(@_);        ### do my parent's init
            $self->interface(ENTITY_CLASS => 'MIME::MyEntity');
            $self->interface(HEAD_CLASS   => 'MIME::MyHead');
            $self;                         ### return
        }

    With no VALUE, returns the VALUE currently associated with that
    ROLE.

new_body_for HEAD
    Instance method. Based on the HEAD of a part we are parsing,
    return a new body object (any desirable subclass of MIME::Body) for
    receiving that part's data.

    If you set the "output_to_core" option to false before parsing (the
    default), then we call "output_path()" and create a new
    MIME::Body::File on that filename.

    If you set the "output_to_core" option to true before parsing, then
    you get a MIME::Body::InCore instead.

    If you want the parser to do something else entirely, you can
    override this method in a subclass.

Temporary File Creation
tmp_dir DIRECTORY
    Instance method. Causes any temporary files created by this parser
    to be created in the given DIRECTORY.

    If called without arguments, returns current value.

    The default value is undef, which will cause new_tmpfile() to use
    the system default temporary directory.

new_tmpfile
    Instance method. Return an IO handle to be used to hold temporary
    data during a parse.

    The default uses MIME::Tools::tmpopen() to create a new temporary
    file, unless tmp_to_core() dictates otherwise, but you can override
    this. You shouldn't need to.

    The location for temporary files can be changed on a per-parser
    basis with tmp_dir().

    If you do override this, make certain that the object you return is
    set for binmode(), and is able to handle the following methods:

        read(BUF, NBYTES)
        getline()
        getlines()
        print(@ARGS)
        flush()
        seek(0, 0)

    Fatal exception if the stream could not be established.

Parse results and error recovery
last_error
    Instance method. Return the error (if any) that we ignored in the
    last parse.

last_head
    Instance method. Return the top-level MIME header of the last
    stream we attempted to parse. This is useful for replying to
    people who sent us bad MIME messages.

        ### Parse an input stream:
        eval { $entity = $parser->parse(\*STDIN) };
        if (!$entity) {    ### parse failed!
            my $decapitated = $parser->last_head;
            ...
        }

results
    Instance method. Return an object containing lots of info from the
    last entity parsed. This will be an instance of class
    MIME::Parser::Results.

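A brief sketch of inspecting the outcome of a parse, assuming MIME-tools is
installed. It uses the errors() and warnings() accessors of
MIME::Parser::Results together with last_head(); the message is made up and
well-formed, so no errors are expected here.

```perl
use strict;
use warnings;
use MIME::Parser;

my $parser = MIME::Parser->new;
$parser->output_to_core(1);

my $entity  = eval { $parser->parse_data("Subject: demo\n\nbody\n") };
my $results = $parser->results;        # a MIME::Parser::Results object

# Messages logged during the parse, split by severity.
my @errors   = $results->errors;
my @warnings = $results->warnings;
printf "%d error(s), %d warning(s)\n", scalar @errors, scalar @warnings;

# The top-level head is available even when $entity is undefined.
my $subject = $parser->last_head->get('Subject');
print "last subject: $subject";
```
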
OPTIMIZING YOUR PARSER
Maximizing speed
Optimum input mechanisms:

    parse()                    YES (if you give it a globref or a
                                   subclass of IO::File)
    parse_open()               YES
    parse_data()               NO  (see below)
    parse_two()                NO  (see below)

Optimum settings:

    decode_headers()           *** (no real difference; 0 is slightly faster)
    extract_nested_messages()  0   (may be slightly faster, but in
                                   general you want it set to 1)
    output_to_core()           0   (will be MUCH faster)
    tmp_to_core()              0   (will be MUCH faster)
    use_inner_files()          0   (if tmp_to_core() is 0;
                                   use 1 otherwise)

File I/O is much faster than in-core I/O. Although it seems like
slurping a message into core and processing it in-core should be
faster... it isn't. Reason: Perl's filehandle-based I/O translates
directly into native operating-system calls, whereas the in-core I/O is
implemented in Perl.

Inner files are slower than real tmpfiles, but faster than in-core
ones. If speed is your concern, that's why you should set
use_inner_files(true) if you set tmp_to_core(true): so that we can
bypass the slow in-core tmpfiles if the input stream permits.

Native I/O is much faster than object-oriented I/O. It's much faster
to use <$foo> than $foo->getline. For backwards compatibility, this
module must continue to use object-oriented I/O in most places, but if
you use parse() with a "real" filehandle (string, globref, or subclass
of IO::File) then MIME::Parser is able to perform some crucial
optimizations.

The parse_two() call is very inefficient. Currently this is just a
front-end onto parse_data(). If your OS supports it, you're far better
off doing something like:

    $parser->parse_open("/bin/cat msg.head msg.body |");

Minimizing memory
Optimum input mechanisms:

    parse()                    YES
    parse_open()               YES
    parse_data()               NO  (in-core I/O will burn core)
    parse_two()                NO  (in-core I/O will burn core)

Optimum settings:

    decode_headers()           *** (no real difference)
    extract_nested_messages()  *** (no real difference)
    output_to_core()           0   (will use MUCH less memory)
    tmp_to_core()              0   (will use MUCH less memory)
    use_inner_files()          *** (no real difference, but set it to 1
                                   if you *must* have tmp_to_core set to 1,
                                   so that you avoid in-core tmpfiles)

Maximizing tolerance of bad MIME
Optimum input mechanisms:

    parse()                    *** (doesn't matter)
    parse_open()               *** (doesn't matter)
    parse_data()               *** (doesn't matter)
    parse_two()                *** (doesn't matter)

Optimum settings:

    decode_headers()           0   (sidesteps problem of bad hdr encodings)
    extract_nested_messages()  0   (sidesteps problems of bad nested messages,
                                   but often you want it set to 1 anyway)
    output_to_core()           *** (doesn't matter)
    tmp_to_core()              *** (doesn't matter)
    use_inner_files()          *** (doesn't matter)

Avoiding disk-based temporary files
Optimum input mechanisms:

    parse()                    YES (if you give it a seekable handle)
    parse_open()               YES (becomes a seekable handle)
    parse_data()               NO  (unless you set tmp_to_core(1))
    parse_two()                NO  (unless you set tmp_to_core(1))

Optimum settings:

    decode_headers()           *** (doesn't matter)
    extract_nested_messages()  *** (doesn't matter)
    output_to_core()           *** (doesn't matter)
    tmp_to_core()              1
    use_inner_files()          1

If we can use them, inner files avoid most tmpfiles. If you parse from
a seekable-and-tellable filehandle, then the internal
process_to_bound() doesn't need to extract each part into a temporary
buffer; it can use IO::InnerFile (warning: this will slow down the
parsing of messages with large attachments).

You can veto tmpfiles entirely. If you might not be parsing from a
seekable-and-tellable filehandle, you can set tmp_to_core() true: this
will always use in-core I/O for the buffering (warning: this will slow
down the parsing of messages with large attachments).

Final resort. You can always override new_tmpfile() in a subclass.

NOTES
Multipart messages are always read line-by-line
    Multipart document parts are read line-by-line, so that the
    encapsulation boundaries may easily be detected. However, bad MIME
    composition agents (for example, naive CGI scripts) might return
    multipart documents where the parts are, say, unencoded bitmap
    files... and, consequently, where such "lines" might be
    veeeeeeeeery long indeed.

    A better solution for this case would be to set up some form of
    state machine for input processing. This will be left for future
    versions.

Multipart parts read into temp files before decoding
    In my original implementation, the MIME::Decoder classes had to be
    aware of encapsulation boundaries in multipart MIME documents.
    While this decode-while-parsing approach obviated the need for
    temporary files, it resulted in inflexible and complex decoder
    implementations.

    The revised implementation uses a temporary file (a la "tmpfile()")
    during parsing to hold the encoded portion of the current MIME
    document or part. This file is deleted automatically after the
    current part is decoded and the data is written to the "body
    stream" object; you'll never see it, and should never need to worry
    about it.

    Some folks have asked for the ability to bypass this temp-file
    mechanism, I suppose because they assume it would slow down their
    application. I considered accommodating this wish, but the
    temp-file approach solves a lot of thorny problems in parsing, and
    it also protects against hidden bugs in user applications (what if
    you've directed the encoded part into a scalar, and someone
    unexpectedly sends you a 6 MB tar file?). Finally, I'm just not
    convinced that the temp-file use adds significant overhead.

Fuzzing of CRLF and newline on input
    RFC 2045 dictates that MIME streams have lines terminated by CRLF
    ("\r\n"). However, it is extremely likely that folks will want to
    parse MIME streams where each line ends in the local newline
    character "\n" instead.

    An attempt has been made to allow the parser to handle both CRLF
    and newline-terminated input.

Fuzzing of CRLF and newline on output
    The "7bit" and "8bit" decoders will decode both a "\n" and a "\r\n"
    end-of-line sequence into a "\n".

    The "binary" decoder (default if no encoding specified) still
    outputs stuff verbatim... so a MIME message with CRLFs and no
    explicit encoding will be output as a text file that, on many
    systems, will have an annoying ^M at the end of each line... but
    this is as it should be.

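The end-of-line folding described for the "7bit" and "8bit" decoders can be
illustrated in plain Perl. This is a sketch of the idea only, not the
module's actual code; normalize_eol() is a hypothetical helper.

```perl
use strict;
use warnings;

# Illustrative helper (not part of MIME::Parser): turn a trailing CRLF
# or bare newline into "\n", as the 7bit/8bit decoders are described
# as doing.
sub normalize_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z/\n/;
    return $line;
}

my @out = map { normalize_eol($_) } ("crlf line\r\n", "unix line\n");
print @out;
```

A "binary" body would skip this step entirely, which is why the ^M
characters survive in that case.
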
Inability to handle multipart boundaries that contain newlines
    First, let's get something straight: this is an evil, EVIL
    practice, and is incompatible with RFC 2046... hence, it's not
    valid MIME.

    If your mailer creates multipart boundary strings that contain
    newlines when they appear in the message body, give it two weeks'
    notice and find another one. If your mail robot receives MIME mail
    like this, regard it as syntactically incorrect MIME, which it is.

    Why do I say that? Well, in RFC 2046, the syntax of a boundary is
    given quite clearly:

        boundary := 0*69<bchars> bcharsnospace

        bchars := bcharsnospace / " "

        bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
                         / "," / "-" / "." / "/" / ":" / "=" / "?"

    All of which means that a valid boundary string cannot have
    newlines in it, and any newlines in such a string in the message
    header are expected to be solely the result of folding the string
    (i.e., inserting to-be-removed newlines for readability and
    line-shortening only).

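The grammar above translates directly into a regular expression. The
following is a plain-Perl sketch, independent of MIME::Parser;
is_valid_boundary() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Character class body for "bcharsnospace" from the grammar above.
my $nospace = q{0-9A-Za-z'()+_,\-./:=?};

# 0 to 69 bchars (bcharsnospace or space), then one bcharsnospace.
my $boundary_re = qr/\A[$nospace ]{0,69}[$nospace]\z/;

sub is_valid_boundary { return $_[0] =~ $boundary_re ? 1 : 0 }

print is_valid_boundary("----ABC-123----"),   "\n";   # well-formed
print is_valid_boundary("----ABC-\n123----"), "\n";   # embedded newline
print is_valid_boundary("ends with space "),  "\n";   # trailing space
```

The \z anchor matters here: with $ a trailing newline would slip through,
which is exactly the case the grammar rules out.
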
    Yet, there is at least one brain-damaged user agent out there that
    composes mail like this:

        MIME-Version: 1.0
        Content-type: multipart/mixed; boundary="----ABC-
         123----"
        Subject: Hi... I'm a dork!

        This is a multipart MIME message (yeah, right...)

        ----ABC-
         123----

        Hi there!

    We have got to discourage practices like this (and the recent file
    upload idiocy where binary files that are part of a multipart MIME
    message aren't base64-encoded) if we want MIME to stay relatively
    simple, and MIME parsers to be relatively robust.

    Thanks to Andreas Koenig for bringing a baaaaaaaaad user agent to
    my attention.

SEE ALSO
MIME::Tools, MIME::Head, MIME::Body, MIME::Entity, MIME::Decoder

AUTHOR
Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

All rights reserved. This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

perl v5.12.0                      2010-04-22                   MIME::Parser(3)