MIME::Parser(3) User Contributed Perl Documentation MIME::Parser(3)

NAME
MIME::Parser - experimental class for parsing MIME streams
7
SYNOPSIS
Before reading further, you should see MIME::Tools to make sure that
you understand where this module fits into the grand scheme of things.
Go on, do it now. I'll wait.

Ready? Ok...

Basic usage examples
    ### Create a new parser object:
    my $parser = MIME::Parser->new;

    ### Tell it where to put things:
    $parser->output_under("/tmp");

    ### Parse an input filehandle:
    $entity = $parser->parse(\*STDIN);

    ### Congratulations: you now have a (possibly multipart) MIME entity!
    $entity->dump_skeleton;    # for debugging
27
Examples of input
    ### Parse from filehandles:
    $entity = $parser->parse(\*STDIN);
    $entity = $parser->parse(IO::File->new("some command|"));

    ### Parse from any object that supports getline() and read():
    $entity = $parser->parse($myHandle);

    ### Parse an in-core MIME message:
    $entity = $parser->parse_data($message);

    ### Parse a MIME message in a file:
    $entity = $parser->parse_open("/some/file.msg");

    ### Parse a MIME message out of a pipeline:
    $entity = $parser->parse_open("gunzip - < file.msg.gz |");

    ### Parse already-split input (as "deliver" would give it to you):
    $entity = $parser->parse_two("msg.head", "msg.body");
47
Examples of output control
    ### Keep parsed message bodies in core (default outputs to disk):
    $parser->output_to_core(1);

    ### Output each message body to a one-per-message directory:
    $parser->output_under("/tmp");

    ### Output each message body to the same directory:
    $parser->output_dir("/tmp");

    ### Change how nameless message-component files are named:
    $parser->output_prefix("msg");

    ### Put temporary files somewhere else:
    $parser->tmp_dir("/var/tmp/mytmpdir");
63
Examples of error recovery
    ### Normal mechanism:
    eval { $entity = $parser->parse(\*STDIN) };
    if ($@) {
        $results     = $parser->results;
        $decapitated = $parser->last_head;    ### get last top-level head
    }

    ### Ultra-tolerant mechanism:
    $parser->ignore_errors(1);
    $entity = eval { $parser->parse(\*STDIN) };
    $error = ($@ || $parser->last_error);

    ### Clean up all files created by the parse:
    eval { $entity = $parser->parse(\*STDIN) };
    ...
    $parser->filer->purge;
81
Examples of parser options
    ### Automatically attempt to RFC 2047-decode the MIME headers?
    $parser->decode_headers(1);             ### default is false

    ### Parse contained "message/rfc822" objects as nested MIME streams?
    $parser->extract_nested_messages(0);    ### default is true

    ### Look for uuencode in "text" messages, and extract it?
    $parser->extract_uuencode(1);           ### default is false

    ### Should we forgive normally-fatal errors?
    $parser->ignore_errors(0);              ### default is true

Miscellaneous examples
    ### Convert a Mail::Internet object to a MIME::Entity:
    my $data = join('', (@{$mail->header}, "\n", @{$mail->body}));
    $entity = $parser->parse_data(\$data);
99
DESCRIPTION
You can inherit from this class to create your own subclasses that
parse MIME streams into MIME::Entity objects.
103
105 Construction
106 new ARGS...
107 Class method. Create a new parser object. Once you do this, you
108 can then set up various parameters before doing the actual parsing.
109 For example:
110
    my $parser = MIME::Parser->new;
    $parser->output_dir("/tmp");
    $parser->output_prefix("msg1");
    my $entity = $parser->parse(\*STDIN);
115
116 Any arguments are passed into "init()". Don't override this in
117 your subclasses; override init() instead.
118
119 init ARGS...
Instance method. Initialize a new MIME::Parser object. This is
automatically invoked on a new object; you may want to override it.
If you override this, be sure to invoke the inherited method.
123
124 init_parse
125 Instance method. Invoked automatically whenever one of the top-
126 level parse() methods is called, to reset the parser to a "ready"
127 state.
128
129 Altering how messages are parsed
130 decode_headers [YESNO]
131 Instance method. Controls whether the parser will attempt to
132 decode all the MIME headers (as per RFC 2047) the moment it sees
133 them. This is not advisable for two very important reasons:
134
135 • It screws up the extraction of information from MIME fields.
136 If you fully decode the headers into bytes, you can
137 inadvertently transform a parseable MIME header like this:
138
    Content-type: text/plain; filename="=?ISO-8859-1?Q?Hi=22Ho?="

into unparseable gobbledygook; in this case:

    Content-type: text/plain; filename="Hi"Ho"
144
145 • It is information-lossy. An encoded string which contains both
146 Latin-1 and Cyrillic characters will be turned into a binary
147 mishmosh which simply can't be rendered.
148
149 History. This method was once the only out-of-the-box way to deal
150 with attachments whose filenames had non-ASCII characters.
151 However, since MIME-tools 5.4xx this is no longer necessary.
152
153 Parameters. If YESNO is true, decoding is done. However, you will
154 get a warning unless you use one of the special "true" values:
155
156 "I_NEED_TO_FIX_THIS"
157 Just shut up and do it. Not recommended.
158 Provided only for those who need to keep old scripts functioning.
159
160 "I_KNOW_WHAT_I_AM_DOING"
161 Just shut up and do it. Not recommended.
162 Provided for those who REALLY know what they are doing.
163
164 If YESNO is false (the default), no attempt at decoding will be
165 done. With no argument, just returns the current setting.
166 Remember: you can always decode the headers after the parsing has
167 completed (see MIME::Head::decode()), or decode the words on demand
168 (see MIME::Words).
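
The first hazard above can be reproduced without the parser at all. The following is a minimal sketch, assuming the core Encode module's "MIME-Header" decoder as a stand-in for the parser's RFC 2047 decoding, and a deliberately naive regex ($param_re, a hypothetical name) in place of real MIME field parsing:

```perl
use strict;
use warnings;
use Encode qw(decode);

# The encoded word from the example above; =22 is a double quote.
my $word = '=?ISO-8859-1?Q?Hi=22Ho?=';
my $raw  = qq{Content-type: text/plain; filename="$word"};

# Decoding before the field is parsed splices a bare quote into the value.
my $decoded_word = decode('MIME-Header', $word);
my $decoded      = qq{Content-type: text/plain; filename="$decoded_word"};

# Naive parameter extraction (illustrative only, not the MIME-tools code).
my $param_re = qr/filename="([^"]*)"/;
my ($from_raw)     = $raw     =~ $param_re;
my ($from_decoded) = $decoded =~ $param_re;

print "raw:     $from_raw\n";        # the intact encoded word
print "decoded: $from_decoded\n";    # cut short at the embedded quote

# Decoding the extracted value afterwards recovers the full name.
print "late:    ", decode('MIME-Header', $from_raw), "\n";
```

Extracting the parameter first and decoding it afterwards is the order the "Remember" note above recommends.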
169
170 extract_nested_messages OPTION
Instance method. Some MIME messages will contain a part of type
"message/rfc822", "message/partial", or "message/external-body":
literally, the text of an embedded mail/news/whatever message.
This option controls whether (and how) we parse that embedded
message.
176
177 If the OPTION is false, we treat such a message just as if it were
178 a "text/plain" document, without attempting to decode its contents.
179
180 If the OPTION is true (the default), the body of the
181 "message/rfc822" or "message/partial" part is parsed by this
182 parser, creating an entity object. What happens then is determined
183 by the actual OPTION:
184
185 NEST or 1
186 The default setting. The contained message becomes the sole
187 "part" of the "message/rfc822" entity (as if the containing
188 message were a special kind of "multipart" message). You can
189 recover the sub-entity by invoking the parts() method on the
190 "message/rfc822" entity.
191
192 REPLACE
193 The contained message replaces the "message/rfc822" entity, as
194 though the "message/rfc822" "container" never existed.
195
196 Warning: notice that, with this option, all the header
197 information in the "message/rfc822" header is lost. This might
198 seriously bother you if you're dealing with a top-level
199 message, and you've just lost the sender's address and the
200 subject line. ":-/".
201
202 Thanks to Andreas Koenig for suggesting this method.
203
204 extract_uuencode [YESNO]
205 Instance method. If set true, then whenever we are confronted with
206 a message whose effective content-type is "text/plain" and whose
207 encoding is 7bit/8bit/binary, we scan the encoded body to see if it
208 contains uuencoded data (generally given away by a "begin XXX"
209 line).
210
211 If it does, we explode the uuencoded message into a multipart,
212 where the text before the first "begin XXX" becomes the first part,
213 and all "begin...end" sections following become the subsequent
214 parts. The filename (if given) is accessible through the normal
215 means.
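
To make the splitting rule concrete, here is a rough sketch using only core Perl. It is not the parser's implementation: split_uuencoded() is a hypothetical helper, it ignores any text between sections, and it silently drops a section that has no "end" line.

```perl
use strict;
use warnings;

# Split a text body into (preamble, section, section, ...). Each section
# is a hash holding the mode, filename, and decoded data of one
# "begin ... end" block.
sub split_uuencoded {
    my ($text) = @_;
    my $preamble = '';
    my @sections;
    my $current;    # defined only while inside a begin...end block

    for my $line (split /^/m, $text) {
        if (defined $current) {
            if ($line =~ /^end\s*$/) {
                push @sections, $current;
                undef $current;
            }
            else {
                # unpack('u') decodes one uuencoded line at a time.
                $current->{data} .= unpack('u', $line);
            }
        }
        elsif ($line =~ /^begin\s+([0-7]{3,4})\s+(\S.*?)\s*$/) {
            $current = { mode => $1, filename => $2, data => '' };
        }
        elsif (!@sections) {
            # Only text before the first "begin" counts as the preamble.
            $preamble .= $line;
        }
    }
    return ($preamble, @sections);
}

my $body = "See the attached file.\n"
         . "begin 644 hello.txt\n"
         . pack('u', "hello world\n")
         . "`\nend\n";

my ($text, @files) = split_uuencoded($body);
printf "%s (mode %s, %d bytes)\n", $_->{filename}, $_->{mode}, length $_->{data}
    for @files;
```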
216
217 ignore_errors [YESNO]
218 Instance method. Controls whether the parser will attempt to
219 ignore normally-fatal errors, treating them as warnings and
220 continuing with the parse.
221
222 If YESNO is true (the default), many syntax errors are tolerated.
223 If YESNO is false, fatal errors throw exceptions. With no
224 argument, just returns the current setting.
225
226 decode_bodies [YESNO]
227 Instance method. Controls whether the parser should decode entity
228 bodies or not. If this is set to a false value (default is true),
229 all entity bodies will be kept as-is in the original content-
230 transfer encoding.
231
To prevent double encoding on the output side, MIME::Body->is_encoded
is set, which tells MIME::Body not to encode the data again if encoded
data is requested. This is particularly useful when the content must
not be modified, e.g. if you want to calculate OpenPGP signatures
from it.

WARNING: the semantics change significantly if you parse MIME
messages with this option set, because MIME::Entity and MIME::Body
then *always* see encoded data, whereas the default behaviour is to
work with *decoded* data (and encode it only if you request it). If
you want decoded data, you need to decode it yourself.

So use this option only if you know exactly what you're doing and
are sure that you really need it.
247
248 Parsing an input source
249 parse_data DATA
Instance method. Parse a MIME message that's already in core.
This internally creates an "in memory" filehandle on a Perl scalar
value using PerlIO.
253
254 You may supply the DATA in any of a number of ways...
255
256 • A scalar which holds the message. A reference to this scalar
257 will be used internally.
258
259 • A ref to a scalar which holds the message. This reference will
260 be used internally.
261
262 • DEPRECATED
263
264 A ref to an array of scalars. The array is internally
265 concatenated into a temporary string, and a reference to the
266 new string is used internally.
267
It is much more efficient to pass in a scalar reference, so
please consider refactoring your code to use that interface
instead. If you absolutely MUST pass an array, you may be
better off using IO::ScalarArray in the calling code to
generate a filehandle, and passing that filehandle to parse().
273
274 Returns the parsed MIME::Entity on success.
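
The "in memory" filehandle mentioned above is a standard PerlIO facility, not something specific to this module. A small sketch of the mechanism on its own (the sample message is made up, and this is not the parser's code):

```perl
use strict;
use warnings;

my $message = "Subject: test\nContent-Type: text/plain\n\nHello, world.\n";

# Open a read-only filehandle directly on the scalar.
open(my $fh, '<', \$message) or die "cannot open in-memory handle: $!";

# Read header lines up to the blank separator line.
my @header;
while (defined(my $line = <$fh>)) {
    last if $line =~ /^\r?\n\z/;
    chomp $line;
    push @header, $line;
}

# Slurp whatever remains as the body.
my $body = do { local $/; <$fh> };
close $fh;

print scalar(@header), " header lines; body: $body";
```

Because the handle reads the scalar in place, passing a scalar reference avoids copying the message, which is why the scalar-ref form is preferred over the array form.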
275
276 parse INSTREAM
277 Instance method. Takes a MIME-stream and splits it into its
278 component entities.
279
280 The INSTREAM can be given as an IO::File, a globref filehandle
281 (like "\*STDIN"), or as any blessed object conforming to the IO::
282 interface (which minimally implements getline() and read()).
283
284 Returns the parsed MIME::Entity on success. Throws exception on
285 failure. If the message contained too many parts (as set by
286 max_parts), returns undef.
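
As an illustration of the "minimally implements getline() and read()" requirement, here is a sketch of a tiny string-backed source. The class name My::StringSource is hypothetical, and whether a given MIME::Parser release calls further methods on its input is something to verify against your installed version.

```perl
use strict;
use warnings;

package My::StringSource;    # hypothetical name, for illustration only

sub new {
    my ($class, $text) = @_;
    return bless { text => $text, pos => 0 }, $class;
}

# Return the next line (including its newline), or undef at end of input.
sub getline {
    my ($self) = @_;
    return if $self->{pos} >= length($self->{text});
    my $nl   = index($self->{text}, "\n", $self->{pos});
    my $end  = $nl < 0 ? length($self->{text}) : $nl + 1;
    my $line = substr($self->{text}, $self->{pos}, $end - $self->{pos});
    $self->{pos} = $end;
    return $line;
}

# Copy up to NBYTES bytes into the caller's buffer; return the count read.
sub read {
    my $self   = shift;
    my $nbytes = $_[1];
    my $chunk  = substr($self->{text}, $self->{pos}, $nbytes);
    $self->{pos} += length $chunk;
    $_[0] = $chunk;          # @_ aliases the caller's buffer variable
    return length $chunk;
}

package main;

my $src   = My::StringSource->new("Subject: hi\n\nbody line 1\nbody line 2\n");
my $first = $src->getline;
my $blank = $src->getline;
my $buf   = '';
my $got   = $src->read($buf, 1024);

print "first line: $first";
print "read $got bytes of body\n";
```

An object like this would be passed as $parser->parse($src), in the same way as the $myHandle example in the synopsis.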
287
288 parse_open EXPR
289 Instance method. Convenience front-end onto "parse()". Simply
290 give this method any expression that may be sent as the second
291 argument to open() to open a filehandle for reading.
292
293 Returns the parsed MIME::Entity on success. Throws exception on
294 failure.
295
296 parse_two HEADFILE, BODYFILE
297 Instance method. Convenience front-end onto "parse_open()",
298 intended for programs running under mail-handlers like deliver,
299 which splits the incoming mail message into a header file and a
300 body file. Simply give this method the paths to the respective
301 files.
302
303 Warning: it is assumed that, once the files are cat'ed together,
304 there will be a blank line separating the head part and the body
305 part.
306
307 Warning: new implementation slurps files into line array for
308 portability, instead of using 'cat'. May be an issue if your
309 messages are large.
310
311 Returns the parsed MIME::Entity on success. Throws exception on
312 failure.
313
314 Specifying output destination
Warning: in 5.212 and before, this was done by methods of MIME::Parser.
However, since many users have requested fine-tuned control over how
this is done, the logic has been split off from the parser into its own
class, MIME::Parser::Filer. Every MIME::Parser maintains an instance of
a MIME::Parser::Filer subclass to manage disk output (see
MIME::Parser::Filer for details).
321
322 The benefit to this is that the MIME::Parser code won't be confounded
323 with a lot of garbage related to disk output. The drawback is that the
324 way you override the default behavior will change.
325
326 For now, all the normal public-interface methods are still provided,
327 but many are only stubs which create or delegate to the underlying
328 MIME::Parser::Filer object.
329
330 filer [FILER]
331 Instance method. Get/set the FILER object used to manage the
332 output of files to disk. This will be some subclass of
333 MIME::Parser::Filer.
334
335 output_dir DIRECTORY
336 Instance method. Causes messages to be filed directly into the
337 given DIRECTORY. It does this by setting the underlying filer() to
338 a new instance of MIME::Parser::FileInto, and passing the arguments
339 into that class' new() method.
340
Note: Since this method replaces the underlying filer, you must
invoke it before changing any attributes of the filer, like the
output prefix; otherwise those changes will be lost.
344
345 output_under BASEDIR, OPTS...
346 Instance method. Causes messages to be filed directly into
347 subdirectories of the given BASEDIR, one subdirectory per message.
348 It does this by setting the underlying filer() to a new instance of
349 MIME::Parser::FileUnder, and passing the arguments into that class'
350 new() method.
351
Note: Since this method replaces the underlying filer, you must
invoke it before changing any attributes of the filer, like the
output prefix; otherwise those changes will be lost.
355
356 output_path HEAD
357 Instance method, DEPRECATED. Given a MIME head for a file to be
358 extracted, come up with a good output pathname for the extracted
359 file. Identical to the preferred form:
360
    $parser->filer->output_path(...args...);
362
363 We just delegate this to the underlying filer() object.
364
365 output_prefix [PREFIX]
366 Instance method, DEPRECATED. Get/set the short string that all
367 filenames for extracted body-parts will begin with (assuming that
368 there is no better "recommended filename"). Identical to the
369 preferred form:
370
    $parser->filer->output_prefix(...args...);
372
373 We just delegate this to the underlying filer() object.
374
375 evil_filename NAME
376 Instance method, DEPRECATED. Identical to the preferred form:
377
    $parser->filer->evil_filename(...args...);
379
380 We just delegate this to the underlying filer() object.
381
382 max_parts NUM
383 Instance method. Limits the number of MIME parts we will parse.
384
385 Normally, instances of this class parse a message to the bitter
386 end. Messages with many MIME parts can cause excessive memory
387 consumption. If you invoke this method, parsing will abort with a
388 die() if a message contains more than NUM parts.
389
390 If NUM is set to -1 (the default), then no maximum limit is
391 enforced.
392
With no argument, returns the current setting as an integer.
394
395 output_to_core YESNO
396 Instance method. Normally, instances of this class output all
397 their decoded body data to disk files (via MIME::Body::File).
398 However, you can change this behaviour by invoking this method
399 before parsing:
400
401 If YESNO is false (the default), then all body data goes to disk
402 files.
403
If YESNO is true, then all body data goes to in-core data
structures. This is a little risky (what if someone emails you an
MPEG or a tar file, hmmm?) but people seem to want this bit of
noose-shaped rope, so I'm providing it. Note that setting this
attribute true does not mean that parser-internal temporary files
are avoided! Use tmp_to_core() for that.
410
411 With no argument, returns the current setting as a boolean.
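
Putting output_to_core() and tmp_to_core() together, a fully in-memory parse might look like the sketch below. The sample message, its addresses, and the boundary string "XYZ" are invented for illustration; only methods documented in MIME::Parser, MIME::Entity, and MIME::Body are used.

```perl
use strict;
use warnings;
use MIME::Parser;

my $message = <<'EOM';
From: a@example.com
To: b@example.com
Subject: demo
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

This is a multipart message.
--XYZ
Content-Type: text/plain

Hello
--XYZ
Content-Type: text/plain
Content-Transfer-Encoding: base64

SGkgdGhlcmUK
--XYZ--
EOM

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep decoded bodies in memory
$parser->tmp_to_core(1);       # keep the parser's temporary buffers in memory too

my $entity = $parser->parse_data($message);

# Each part's decoded body is available as a string; nothing touches disk.
for my $part ($entity->parts) {
    my $body = $part->bodyhandle->as_string;
    printf "%s: %d bytes\n", $part->mime_type, length $body;
}
```

This is only sensible when message sizes are known to be small, for the reasons given above.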
412
413 tmp_recycling
414 Instance method, DEPRECATED.
415
416 This method is a no-op to preserve the pre-5.421 API.
417
418 The tmp_recycling() feature was removed in 5.421 because it had
419 never actually worked. Please update your code to stop using it.
420
421 tmp_to_core [YESNO]
422 Instance method. Should new_tmpfile() create real temp files, or
423 use fake in-core ones? Normally we allow the creation of temporary
424 disk files, since this allows us to handle huge attachments even
425 when core is limited.
426
427 If YESNO is true, we implement new_tmpfile() via in-core handles.
428 If YESNO is false (the default), we use real tmpfiles. With no
429 argument, just returns the current setting.
430
431 use_inner_files [YESNO]
432 REMOVED.
433
434 Instance method.
435
436 MIME::Parser no longer supports IO::InnerFile, but this method is
437 retained for backwards compatibility. It does nothing.
438
439 The original reasoning for IO::InnerFile was that inner files were
440 faster than "in-core" temp files. At the time, the "in-core"
441 tempfile support was implemented with IO::Scalar from the IO-
442 Stringy distribution, which used the tie() interface to wrap a
443 scalar with the appropriate IO::Handle operations. The penalty for
444 this was fairly hefty, and IO::InnerFile actually was faster.
445
446 Nowadays, MIME::Parser uses Perl's built in ability to open a
447 filehandle on an in-memory scalar variable via PerlIO.
448 Benchmarking shows that IO::InnerFile is slightly slower than using
449 in-memory temporary files, and is slightly faster than on-disk
450 temporary files. Both measurements are within a few percent of
451 each other. Since there's no real benefit, and since the
452 IO::InnerFile abuse was fairly hairy and evil ("writes" to it were
453 faked by extending the size of the inner file with the assumption
454 that the only data you'd ever ->print() to it would be the line
455 from the "outer" file, for example) it's been removed.
456
457 Specifying classes to be instantiated
458 interface ROLE,[VALUE]
459 Instance method. During parsing, the parser normally creates
460 instances of certain classes, like MIME::Entity. However, you may
461 want to create a parser subclass that uses your own experimental
462 head, entity, etc. classes (for example, your "head" class may
463 provide some additional MIME-field-oriented methods).
464
465 If so, then this is the method that your subclass should invoke
466 during init. Use it like this:
467
    package MyParser;
    @ISA = qw(MIME::Parser);
    ...
    sub init {
        my $self = shift;
        $self->SUPER::init(@_);    ### do my parent's init
        $self->interface(ENTITY_CLASS => 'MIME::MyEntity');
        $self->interface(HEAD_CLASS   => 'MIME::MyHead');
        $self;                     ### return
    }
478
479 With no VALUE, returns the VALUE currently associated with that
480 ROLE.
481
482 new_body_for HEAD
483 Instance method. Based on the HEAD of a part we are parsing,
484 return a new body object (any desirable subclass of MIME::Body) for
485 receiving that part's data.
486
487 If you set the "output_to_core" option to false before parsing (the
488 default), then we call "output_path()" and create a new
489 MIME::Body::File on that filename.
490
491 If you set the "output_to_core" option to true before parsing, then
492 you get a MIME::Body::InCore instead.
493
494 If you want the parser to do something else entirely, you can
495 override this method in a subclass.
496
497 Temporary File Creation
498 tmp_dir DIRECTORY
499 Instance method. Causes any temporary files created by this parser
500 to be created in the given DIRECTORY.
501
502 If called without arguments, returns current value.
503
504 The default value is undef, which will cause new_tmpfile() to use
505 the system default temporary directory.
506
507 new_tmpfile
508 Instance method. Return an IO handle to be used to hold temporary
509 data during a parse.
510
511 The default uses MIME::Tools::tmpopen() to create a new temporary
512 file, unless tmp_to_core() dictates otherwise, but you can override
513 this. You shouldn't need to.
514
515 The location for temporary files can be changed on a per-parser
516 basis with tmp_dir().
517
518 If you do override this, make certain that the object you return is
519 set for binmode(), and is able to handle the following methods:
520
    read(BUF, NBYTES)
    getline()
    getlines()
    print(@ARGS)
    flush()
    seek(0, 0)
527
528 Fatal exception if the stream could not be established.
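
For anyone who does override new_tmpfile(), the following sketch shows one handle type that meets the requirements listed above, using the core IO::File module. The helper name make_tmp_handle() is hypothetical; in a real subclass its body would become the overriding method.

```perl
use strict;
use warnings;
use IO::File;

# Hypothetical helper: the kind of handle an overriding new_tmpfile()
# might return.
sub make_tmp_handle {
    my $fh = IO::File->new_tmpfile
        or die "cannot create temporary file: $!";
    binmode($fh);    # the parser expects a binmode() handle
    return $fh;
}

# Exercise each method the parser is documented to need.
my $fh = make_tmp_handle();
$fh->print("line one\n", "line two\n");
$fh->flush;
$fh->seek(0, 0);

my $first = $fh->getline;     # a single line
my @rest  = $fh->getlines;    # all remaining lines

$fh->seek(0, 0);
my $buf   = '';
my $count = $fh->read($buf, 4);    # raw bytes

print "first: $first";
print "rest:  ", scalar(@rest), " line(s)\n";
print "read:  $count bytes ($buf)\n";
```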
529
530 Parse results and error recovery
531 last_error
532 Instance method. Return the error (if any) that we ignored in the
533 last parse.
534
535 last_head
536 Instance method. Return the top-level MIME header of the last
537 stream we attempted to parse. This is useful for replying to
538 people who sent us bad MIME messages.
539
    ### Parse an input stream:
    eval { $entity = $parser->parse(\*STDIN) };
    if (!$entity) {    ### parse failed!
        my $decapitated = $parser->last_head;
        ...
    }
546
547 results
548 Instance method. Return an object containing lots of info from the
549 last entity parsed. This will be an instance of class
550 MIME::Parser::Results.
551
553 Maximizing speed
Optimum input mechanisms:

    parse()                    YES (if you give it a globref or a
                                   subclass of IO::File)
    parse_open()               YES
    parse_data()               NO  (see below)
    parse_two()                NO  (see below)

Optimum settings:

    decode_headers()           *** (no real difference; 0 is slightly faster)
    extract_nested_messages()  0   (may be slightly faster, but in
                                   general you want it set to 1)
    output_to_core()           0   (will be MUCH faster)
    tmp_to_core()              0   (will be MUCH faster)
569
570 Native I/O is much faster than object-oriented I/O. It's much faster
571 to use <$foo> than $foo->getline. For backwards compatibility, this
572 module must continue to use object-oriented I/O in most places, but if
573 you use parse() with a "real" filehandle (string, globref, or subclass
574 of IO::File) then MIME::Parser is able to perform some crucial
575 optimizations.
576
577 The parse_two() call is very inefficient. Currently this is just a
578 front-end onto parse_data(). If your OS supports it, you're far better
579 off doing something like:
580
    $parser->parse_open("/bin/cat msg.head msg.body |");
582
583 Minimizing memory
Optimum input mechanisms:

    parse()                    YES
    parse_open()               YES
    parse_data()               NO  (in-core I/O will burn core)
    parse_two()                NO  (in-core I/O will burn core)

Optimum settings:

    decode_headers()           *** (no real difference)
    extract_nested_messages()  *** (no real difference)
    output_to_core()           0   (will use MUCH less memory)
    tmp_to_core()              0   (will use MUCH less memory)
598
599 Maximizing tolerance of bad MIME
Optimum input mechanisms:

    parse()                    *** (doesn't matter)
    parse_open()               *** (doesn't matter)
    parse_data()               *** (doesn't matter)
    parse_two()                *** (doesn't matter)

Optimum settings:

    decode_headers()           0   (sidesteps problem of bad hdr encodings)
    extract_nested_messages()  0   (sidesteps problems of bad nested messages,
                                   but often you want it set to 1 anyway).
    output_to_core()           *** (doesn't matter)
    tmp_to_core()              *** (doesn't matter)
614
615 Avoiding disk-based temporary files
Optimum input mechanisms:

    parse()                    YES (if you give it a seekable handle)
    parse_open()               YES (becomes a seekable handle)
    parse_data()               NO  (unless you set tmp_to_core(1))
    parse_two()                NO  (unless you set tmp_to_core(1))

Optimum settings:

    decode_headers()           *** (doesn't matter)
    extract_nested_messages()  *** (doesn't matter)
    output_to_core()           *** (doesn't matter)
    tmp_to_core()              1
629
630 You can veto tmpfiles entirely. You can set tmp_to_core() true: this
631 will always use in-core I/O for the buffering (warning: this will slow
632 down the parsing of messages with large attachments).
633
634 Final resort. You can always override new_tmpfile() in a subclass.
635
637 Multipart messages are always read line-by-line
638 Multipart document parts are read line-by-line, so that the
639 encapsulation boundaries may easily be detected. However, bad MIME
640 composition agents (for example, naive CGI scripts) might return
641 multipart documents where the parts are, say, unencoded bitmap
642 files... and, consequently, where such "lines" might be
643 veeeeeeeeery long indeed.
644
645 A better solution for this case would be to set up some form of
646 state machine for input processing. This will be left for future
647 versions.
648
649 Multipart parts read into temp files before decoding
650 In my original implementation, the MIME::Decoder classes had to be
651 aware of encapsulation boundaries in multipart MIME documents.
652 While this decode-while-parsing approach obviated the need for
653 temporary files, it resulted in inflexible and complex decoder
654 implementations.
655
656 The revised implementation uses a temporary file (a la "tmpfile()")
657 during parsing to hold the encoded portion of the current MIME
658 document or part. This file is deleted automatically after the
659 current part is decoded and the data is written to the "body
660 stream" object; you'll never see it, and should never need to worry
661 about it.
662
663 Some folks have asked for the ability to bypass this temp-file
664 mechanism, I suppose because they assume it would slow down their
665 application. I considered accommodating this wish, but the temp-
666 file approach solves a lot of thorny problems in parsing, and it
667 also protects against hidden bugs in user applications (what if
668 you've directed the encoded part into a scalar, and someone
669 unexpectedly sends you a 6 MB tar file?). Finally, I'm just not
670 convinced that the temp-file use adds significant overhead.
671
672 Fuzzing of CRLF and newline on input
673 RFC 2045 dictates that MIME streams have lines terminated by CRLF
674 ("\r\n"). However, it is extremely likely that folks will want to
675 parse MIME streams where each line ends in the local newline
676 character "\n" instead.
677
678 An attempt has been made to allow the parser to handle both CRLF
679 and newline-terminated input.
680
681 Fuzzing of CRLF and newline on output
682 The "7bit" and "8bit" decoders will decode both a "\n" and a "\r\n"
683 end-of-line sequence into a "\n".
684
685 The "binary" decoder (default if no encoding specified) still
686 outputs stuff verbatim... so a MIME message with CRLFs and no
687 explicit encoding will be output as a text file that, on many
688 systems, will have an annoying ^M at the end of each line... but
689 this is as it should be.
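
The end-of-line handling described for the "7bit" and "8bit" decoders amounts to something like the sketch below. normalize_eol() is a hypothetical helper written for this illustration, not the MIME::Decoder code; a "binary" decoder would skip this step and pass every byte through.

```perl
use strict;
use warnings;

# Map a trailing CRLF or bare newline to a single "\n". Anything else,
# including a stray "\r" in the middle of a line, is left untouched.
sub normalize_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z/\n/;
    return $line;
}

my @in  = ("Hello\r\n", "World\n", "no terminator");
my @out = map { normalize_eol($_) } @in;

for my $i (0 .. $#in) {
    printf "%2d bytes in, %2d bytes out\n", length $in[$i], length $out[$i];
}
```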
690
691 Inability to handle multipart boundaries that contain newlines
692 First, let's get something straight: this is an evil, EVIL
693 practice, and is incompatible with RFC 2046... hence, it's not
694 valid MIME.
695
If your mailer creates multipart boundary strings that contain
newlines when they appear in the message body, give it two weeks'
notice and find another one. If your mail robot receives MIME mail
like this, regard it as syntactically incorrect MIME, which it is.
700
701 Why do I say that? Well, in RFC 2046, the syntax of a boundary is
702 given quite clearly:
703
    boundary      := 0*69<bchars> bcharsnospace

    bchars        := bcharsnospace / " "

    bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
                   / "," / "-" / "." / "/" / ":" / "=" / "?"
710
711 All of which means that a valid boundary string cannot have
712 newlines in it, and any newlines in such a string in the message
713 header are expected to be solely the result of folding the string
714 (i.e., inserting to-be-removed newlines for readability and line-
715 shortening only).
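
The grammar translates almost directly into a regular expression. This is a sketch for checking a boundary string after any header folding has been undone; is_valid_boundary() is a hypothetical helper and is not part of MIME-tools.

```perl
use strict;
use warnings;

# bcharsnospace and bchars from the grammar above, as character classes.
my $bcharsnospace = qr{[0-9A-Za-z'()+_,\-./:=?]};
my $bchars        = qr{[0-9A-Za-z'()+_,\-./:=? ]};

# boundary := 0*69<bchars> bcharsnospace
# \z (rather than $) is used so that a trailing newline is rejected too.
my $boundary_re = qr{\A(?:$bchars){0,69}$bcharsnospace\z};

sub is_valid_boundary {
    my ($boundary) = @_;
    return $boundary =~ $boundary_re ? 1 : 0;
}

for my $candidate ("----ABC-123----", "----ABC-\n123----", "trailing space ") {
    (my $shown = $candidate) =~ s/\n/\\n/g;
    printf "%-22s %s\n", qq{"$shown"},
        is_valid_boundary($candidate) ? "valid" : "invalid";
}
```

Only the first candidate passes: the second contains a newline, and the third ends in a space, which the final bcharsnospace position forbids.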
716
717 Yet, there is at least one brain-damaged user agent out there that
718 composes mail like this:
719
    MIME-Version: 1.0
    Content-type: multipart/mixed; boundary="----ABC-
     123----"
    Subject: Hi... I'm a dork!

    This is a multipart MIME message (yeah, right...)

    ----ABC-
     123----

    Hi there!
731
732 We have got to discourage practices like this (and the recent file
733 upload idiocy where binary files that are part of a multipart MIME
734 message aren't base64-encoded) if we want MIME to stay relatively
735 simple, and MIME parsers to be relatively robust.
736
737 Thanks to Andreas Koenig for bringing a baaaaaaaaad user agent to
738 my attention.
739
SEE ALSO
MIME::Tools, MIME::Head, MIME::Body, MIME::Entity, MIME::Decoder

AUTHOR
Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
Dianne Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

All rights reserved. This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

perl v5.34.0 2022-01-21 MIME::Parser(3)