MIME::Head(3)         User Contributed Perl Documentation        MIME::Head(3)

NAME

       MIME::Head - MIME message header (a subclass of Mail::Header)

SYNOPSIS

       Before reading further, you should see MIME::Tools to make sure that
       you understand where this module fits into the grand scheme of things.
       Go on, do it now.  I'll wait.

       Ready?  Ok...

       Construction

           ### Create a new, empty header, and populate it manually:
           $head = MIME::Head->new;
           $head->replace('content-type', 'text/plain; charset=US-ASCII');
           $head->replace('content-length', $len);

           ### Parse a new header from a filehandle:
           $head = MIME::Head->read(\*STDIN);

           ### Parse a new header from a file, or a readable pipe:
           $testhead = MIME::Head->from_file("/tmp/test.hdr");
           $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");

       Output

           ### Output to filehandle:
           $head->print(\*STDOUT);

           ### Output as string:
           print STDOUT $head->as_string;
           print STDOUT $head->stringify;

       Getting field contents

           ### Is this a reply?
           $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);

           ### Get receipt information:
           print "Last received from: ", $head->get('Received', 0), "\n";
           @all_received = $head->get('Received');

           ### Print the subject, or the empty string if none:
           print "Subject: ", $head->get('Subject',0), "\n";

           ### Too many hops?  Count 'em and see!
           if ($head->count('Received') > 5) { ...

           ### Test whether a given field exists
           warn "missing subject!" if (! $head->count('subject'));

       Setting field contents

           ### Declare this to be an HTML header:
           $head->replace('Content-type', 'text/html');

       Manipulating field contents

           ### Get rid of internal newlines in fields:
           $head->unfold;

           ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
           $head->decode;

       Getting high-level MIME information

           ### Get/set a given MIME attribute:
           unless ($charset = $head->mime_attr('content-type.charset')) {
               $head->mime_attr("content-type.charset" => "US-ASCII");
           }

           ### The content type (e.g., "text/html"):
           $mime_type     = $head->mime_type;

           ### The content transfer encoding (e.g., "quoted-printable"):
           $mime_encoding = $head->mime_encoding;

           ### The recommended name when extracted:
           $file_name     = $head->recommended_filename;

           ### The boundary text, for multipart messages:
           $boundary      = $head->multipart_boundary;

DESCRIPTION

       A class for parsing in and manipulating RFC-822 message headers, with
       some methods geared towards standard (and not so standard) MIME fields
       as specified in RFC-1521, Multipurpose Internet Mail Extensions.

PUBLIC INTERFACE

       Creation, input, and output

       new [ARG],[OPTIONS]
           Class method, inherited.  Creates a new header object.  Arguments
           are the same as those in the superclass.

       from_file EXPR,OPTIONS
           Class or instance method.  For convenience, you can use this to
           parse a header object in from EXPR, which may actually be any
           expression that can be sent to open() so as to return a readable
           filehandle.  The "file" will be opened, read, and then closed:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->from_file("/tmp/test.hdr");

           Since this method can function as either a class constructor or an
           instance initializer, the above is exactly equivalent to:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->new->from_file("/tmp/test.hdr");

           On success, the object will be returned; on failure, the undefined
           value.

           The OPTIONS are the same as in new(), and are passed into new() if
           this is invoked as a class method.

           Note: This is really just a convenience front-end onto "read()",
           provided mostly for backwards-compatibility with MIME-parser 1.0.

       read FILEHANDLE
           Instance (or class) method.  This initializes a header object by
           reading it in from a FILEHANDLE, until the terminating blank line
           is encountered.  A syntax error or end-of-stream will also halt
           processing.

           Supply this routine with a reference to a filehandle glob; e.g.,
           "\*STDIN":

               ### Create a new header by parsing in STDIN:
               $head->read(\*STDIN);

           On success, the self object will be returned; on failure, a false
           value.

           Note: in the MIME world, it is perfectly legal for a header to be
           empty, consisting of nothing but the terminating blank line.  Thus,
           we can't just use the formula that "no tags equals error".

           Warning: as of the time of this writing, Mail::Header::read did not
           flag either syntax errors or unexpected end-of-file conditions (an
           EOF before the terminating blank line).  MIME::ParserBase takes
           this into account.

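           The stopping rule described for read() can be illustrated in plain
           Perl, without MIME::Head.  The sketch below is not the module's
           implementation: read_header_lines() is a hypothetical helper that
           collects lines up to the terminating blank line and leaves the rest
           of the stream unread.  It also shows why an empty header cannot be
           treated as an error.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIME::Head): collect raw header lines
# from a filehandle, stopping at the terminating blank line or at EOF.
sub read_header_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^\r?\n\z/;    # blank line ends the header
        push @lines, $line;
    }
    return @lines;
}

# An in-memory filehandle stands in for STDIN or a file.
my $message = "Subject: Hi\nTo: you\@example.com\n\nBody text\n";
open(my $fh, '<', \$message) or die "cannot open in-memory handle: $!";
my @header = read_header_lines($fh);
my $rest   = do { local $/; <$fh> };    # whatever follows the header

# A header consisting only of the blank line is legal, and simply empty.
my $empty_message = "\nBody only\n";
open(my $fh2, '<', \$empty_message) or die "cannot open in-memory handle: $!";
my @empty_header = read_header_lines($fh2);
```

           Here @header holds the two field lines, $rest holds the body, and
           @empty_header is an empty list even though nothing went wrong.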
       Getting/setting fields

       The following are methods related to retrieving and modifying the
       header fields.  Some are inherited from Mail::Header, but I've kept the
       documentation around for convenience.

       add TAG,TEXT,[INDEX]
           Instance method, inherited.  Add a new occurrence of the field
           named TAG, given by TEXT:

               ### Add the trace information:
               $head->add('Received',
                          'from eryq.pr.mcs.net by gonzo.net with smtp');

           Normally, the new occurrence will be appended to the existing
           occurrences.  However, if the optional INDEX argument is 0, then
           the new occurrence will be prepended.  If you want to be explicit
           about appending, specify an INDEX of -1.

           Warning: this method always adds new occurrences; it doesn't
           overwrite any existing occurrences... so if you just want to
           change the value of a field (creating it if necessary), then you
           probably don't want to use this method: consider using "replace()"
           instead.

       count TAG
           Instance method, inherited.  Returns the number of occurrences of
           a field; in a boolean context, this tells you whether a given
           field exists:

               ### Was a "Subject:" field given?
               $subject_was_given = $head->count('subject');

           The TAG is treated in a case-insensitive manner.  This method
           returns some false value if the field doesn't exist, and some true
           value if it does.

       decode [FORCE]
           Instance method, DEPRECATED.  Go through all the header fields,
           looking for RFC-1522-style "Q" (quoted-printable, sort of) or "B"
           (base64) encoding, and decode them in-place.  Fellow Americans, you
           probably don't know what the hell I'm talking about.  Europeans,
           Russians, et al, you probably do.  ":-)".

           This method has been deprecated.  See "decode_headers" in
           MIME::Parser for the full reasons.  If you absolutely must use it
           and don't like the warning, then provide a FORCE:

              "I_NEED_TO_FIX_THIS"
                     Just shut up and do it.  Not recommended.
                     Provided only for those who need to keep old scripts functioning.

              "I_KNOW_WHAT_I_AM_DOING"
                     Just shut up and do it.  Not recommended.
                     Provided for those who REALLY know what they are doing.

           What this method does: as an example, consider a valid email
           header you might get:

               From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
               To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
               CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
                =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
                =?US-ASCII?Q?.._cool!?=

           That basically decodes to (sorry, I can only approximate the Latin
           characters with 7 bit sequences /o and 'e):

               From: Keith Moore <moore@cs.utk.edu>
               To: Keld J/orn Simonsen <keld@dkuug.dk>
               CC: Andr'e  Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: If you can read this you understand the example... cool!

           Note: currently, the decodings are done without regard to the
           character set: thus, the Q-encoding "=F8" is simply translated to
           the octet (hexadecimal "F8"), period.  For piece-by-piece decoding
           of a given field, you want the array context of
           "MIME::Words::decode_mimewords()".

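           Where the character set does matter, one alternative is the core
           Encode module, whose "MIME-Header" encoding turns encoded words
           into Perl character strings according to their declared charset.
           This is a different technique from decode(), shown only for
           comparison; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Charset-aware decoding with core Encode.  This is NOT what
# MIME::Head::decode does; that method leaves raw octets in place.
my $raw  = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>';
my $text = decode('MIME-Header', $raw);

# $text is a character string; U+00F8 is the Latin "o with stroke".
my $has_o_stroke = index($text, "\x{F8}") >= 0 ? 1 : 0;
```

           The result is a character string rather than a run of octets, so
           it needs to be encoded again (for example as UTF-8) before being
           written to a byte-oriented filehandle.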
           Warning: the CRLF+SPACE separator that splits up long encoded words
           into shorter sequences (see the Subject: example above) gets lost
           when the field is unfolded, and so decoding after unfolding causes
           a spurious space to be left in the field.  THEREFORE: if you're
           going to decode, do so BEFORE unfolding!

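           The charset-blind decoding described above can be sketched with
           the core MIME::Base64 module.  decode_word() and decode_field()
           are hypothetical helpers, not the code inside MIME::Head; they
           assume well-formed encoded words and return raw octets.  Note how
           the fold between adjacent encoded words is dropped before
           decoding, which is the step that goes wrong once a field has
           already been unfolded.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Decode one encoded-word payload into raw octets (no charset conversion).
sub decode_word {
    my ($encoding, $payload) = @_;
    return decode_base64($payload) if uc($encoding) eq 'B';
    # "Q" encoding: "_" stands for a space, "=XX" for the octet 0xXX.
    $payload =~ tr/_/ /;
    $payload =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $payload;
}

# Decode every encoded-word in a raw (still folded) field body.
sub decode_field {
    my ($text) = @_;
    my $word = qr/=\?[^?\s]+\?[QqBb]\?[^?\s]*\?=/;
    # White space (including the CRLF+SPACE fold) between two adjacent
    # encoded-words is only a separator, so drop it before decoding.
    $text =~ s/($word)\s+(?=$word)/$1/g;
    $text =~ s/=\?[^?\s]+\?([QqBb])\?([^?\s]*)\?=/decode_word($1, $2)/ge;
    return $text;
}

my $subject = decode_field(
    "=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=\r\n"
  . " =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=\r\n"
  . " =?US-ASCII?Q?.._cool!?="
);
my $to = decode_field('=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>');
```

           With the Subject: example from this section, $subject comes out
           as one unbroken sentence, and $to contains the single octet 0xF8
           where the "=F8" was.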
           This method returns the self object.

           Thanks to Kent Boortz for providing the idea, and the baseline
           RFC-1522-decoding code.

       delete TAG,[INDEX]
           Instance method, inherited.  Delete all occurrences of the field
           named TAG.

               ### Remove some MIME information:
               $head->delete('MIME-Version');
               $head->delete('Content-type');

       get TAG,[INDEX]
           Instance method, inherited.  Get the contents of field TAG.

           If a numeric INDEX is given, returns the occurrence at that index,
           or undef if not present:

               ### Print the first and last 'Received:' entries (explicitly):
               print "First, or most recent: ", $head->get('received', 0), "\n";
               print "Last, or least recent: ", $head->get('received',-1), "\n";

           If no INDEX is given, but invoked in a scalar context, then INDEX
           simply defaults to 0:

               ### Get the first 'Received:' entry (implicitly):
               my $most_recent = $head->get('received');

           If no INDEX is given, and invoked in an array context, then all
           occurrences of the field are returned:

               ### Get all 'Received:' entries:
               my @all_received = $head->get('received');

       get_all FIELD
           Instance method.  Returns the list of all occurrences of the
           field, or the empty list if the field is not present:

               ### How did it get here?
               @history = $head->get_all('Received');

           Note: I had originally experimented with having "get()" return all
           occurrences when invoked in an array context... but that causes a
           lot of accidents when you get careless and do stuff like this:

               print "\u$field: ", $head->get($field), "\n";

           It also made the intuitive behaviour unclear if the INDEX argument
           was given in an array context.  So I opted for an explicit approach
           to asking for all occurrences.

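           The context pitfall described in this note can be reproduced with
           a few lines of plain Perl.  In the sketch below, %fields and
           get_field() are hypothetical stand-ins (not the MIME::Head
           implementation) for a getter that returns every occurrence in
           list context and only the first in scalar context.  Arguments to
           print are evaluated in list context, so the careless form picks
           up all three values.

```perl
use strict;
use warnings;

# Hypothetical store: each tag maps to a list of occurrences.
my %fields = (received => ['hop 3', 'hop 2', 'hop 1']);

# Context-sensitive getter: all occurrences in list context,
# the first one (index 0) in scalar context.
sub get_field {
    my ($tag) = @_;
    my @occ = @{ $fields{lc $tag} || [] };
    return wantarray ? @occ : $occ[0];
}

my $first    = get_field('Received');                          # scalar context
my @all      = get_field('Received');                          # list context
my @careless = ('Received: ', get_field('Received'), "\n");    # list context too
my @explicit = ('Received: ', scalar(get_field('Received')), "\n");
```

           Forcing scalar context, as in @explicit, is one way to ask for a
           single occurrence inside an argument list.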
       print [OUTSTREAM]
           Instance method, override.  Print the header out to the given
           OUTSTREAM, or the currently-selected filehandle if none.  The
           OUTSTREAM may be a filehandle, or any object that responds to a
           print() message.

           The override actually lets you print to any object that responds to
           a print() method.  This is vital for outputting MIME entities to
           scalars.

           Also, it defaults to the currently-selected filehandle if none is
           given (not STDOUT!), so please supply a filehandle to prevent
           confusion.

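           Printing to "any object that responds to print()" is what makes it
           possible to capture output in a scalar.  The sketch below uses a
           hypothetical StringSink class and a hypothetical emit() routine
           (neither is part of MIME-tools) to show the idea in plain Perl.

```perl
use strict;
use warnings;

{
    # Hypothetical sink: an object whose print() appends to a scalar.
    package StringSink;
    sub new   { my ($class) = @_; my $buf = ''; return bless \$buf, $class }
    sub print { my $self = shift; $$self .= join('', @_); return 1 }
    sub value { my $self = shift; return $$self }
}

# Hypothetical emitter: it only requires that $out can print().
sub emit {
    my ($out, @lines) = @_;
    $out->print($_) for @lines;
    return $out;
}

my $sink = StringSink->new;
emit($sink, "Subject: Hi\n", "To: you\n");
my $captured = $sink->value;
```

           Any object with the same print() interface could presumably be
           passed where an OUTSTREAM is expected.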
       stringify
           Instance method.  Return the header as a string.  You can also
           invoke it as "as_string".

       unfold [FIELD]
           Instance method, inherited.  Unfold (remove newlines in) the text
           of all occurrences of the given FIELD.  If the FIELD is omitted,
           all fields are unfolded.  Returns the "self" object.

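           As a rough picture of what unfolding means, the sketch below joins
           a folded field body into one line.  unfold_text() is a
           hypothetical helper, and the choice to replace each line break and
           its leading white space with a single space is an assumption, not
           a statement about the inherited implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: join a folded field body into a single line by
# replacing each line break plus its leading white space with one space.
sub unfold_text {
    my ($text) = @_;
    $text =~ s/\r?\n[ \t]+/ /g;
    return $text;
}

my $folded = "from gsfc.nasa.gov by eryq.pr.mcs.net\n"
           . "    (Linux Smail3.1.28.1 #5);\n"
           . "     Thu, 21 Dec 95 16:34 CST";
my $unfolded = unfold_text($folded);
```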
       MIME-specific methods

       All of the following methods extract information from the following
       fields:

           Content-type
           Content-transfer-encoding
           Content-disposition

       Be aware that they do not just return the raw contents of those fields,
       and in some cases they will fill in sensible (I hope) default values.
       Use "get()" or "mime_attr()" if you need to grab and process the raw
       field text.

       Note: some of these methods are provided both as a convenience and for
       backwards-compatibility only, while others (like
       recommended_filename()) really do have to be in MIME::Head to work
       properly, since they look for their value in more than one field.
       However, if you know that a value is restricted to a single field, you
       should really use the Mail::Field interface to get it.

       mime_attr ATTR,[VALUE]
           A quick-and-easy interface to set/get the attributes in structured
           MIME fields:

               $head->mime_attr("content-type"         => "text/html");
               $head->mime_attr("content-type.charset" => "US-ASCII");
               $head->mime_attr("content-type.name"    => "homepage.html");

           This would cause the final output to look something like this:

               Content-type: text/html; charset=US-ASCII; name="homepage.html"

           Note that the special empty sub-field tag indicates the anonymous
           first sub-field.

           Giving VALUE as undefined will cause the contents of the named
           sub-field to be deleted:

               $head->mime_attr("content-type.charset" => undef);

           Supplying no VALUE argument just returns the attribute's value, or
           undefined if it isn't there:

               $type = $head->mime_attr("content-type");      ### text/html
               $name = $head->mime_attr("content-type.name"); ### homepage.html

           In all cases, the new/current value is returned.

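           To make the dotted attribute names concrete, the sketch below
           splits a structured field value into a hash keyed the same way.
           parse_attrs() is a hypothetical and deliberately simplified
           parser: it ignores comments, escaped quotes, and semicolons inside
           quoted values, so it is an illustration rather than a substitute
           for mime_attr().

```perl
use strict;
use warnings;

# Hypothetical, simplified parser: split a structured field value into
# attributes keyed by dotted names such as "content-type.charset".
sub parse_attrs {
    my ($field, $value) = @_;
    my ($first, @params) = split /\s*;\s*/, $value;
    my %attr = (lc($field) => $first);          # anonymous first sub-field
    for my $param (@params) {
        my ($name, $val) = $param =~ /^([^=\s]+)\s*=\s*(.*)$/ or next;
        $val =~ s/^"(.*)"$/$1/;                  # strip surrounding quotes
        $attr{ lc($field) . '.' . lc($name) } = $val;
    }
    return %attr;
}

my %attr = parse_attrs('Content-type',
    'text/html; charset=US-ASCII; name="homepage.html"');
```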
       mime_encoding
           Instance method.  Try real hard to determine the content transfer
           encoding (e.g., "base64", "binary"), which is returned in
           all-lowercase.

           If no encoding could be found, the default of "7bit" is returned.
           I quote from RFC-1521 section 5:

               This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
               is assumed if the Content-Transfer-Encoding header field is not present.

           I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are
           corrected to "7bit"; likewise for "8bit".

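           The fixups listed above amount to a small normalization step.  The
           following sketch shows one way to express them; normalize_encoding()
           is a hypothetical helper written for this illustration, not the
           method's actual body.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the fixups described above: lowercase the
# value, repair "7 bit"-style spellings, and default to "7bit".
sub normalize_encoding {
    my ($raw) = @_;
    return '7bit' unless defined $raw && $raw =~ /\S/;
    my $enc = lc $raw;
    $enc =~ s/^\s+|\s+$//g;                 # trim surrounding white space
    $enc =~ s/^([78])[ _-]bit$/${1}bit/;    # "7_bit", "7-bit", "7 bit"
    return $enc;
}

my @normalized = map { normalize_encoding($_) }
    ('7-BIT', '8_bit', '7 bit', 'Base64', "Quoted-Printable\n", undef);
```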
       mime_type [DEFAULT]
           Instance method.  Try "real hard" to determine the content type
           (e.g., "text/plain", "image/gif", "x-weird-type"), which is
           returned in all-lowercase.  "Real hard" means that if no content
           type could be found, the default (usually "text/plain") is
           returned.  From RFC-1521 section 7.1:

               The default Content-Type for Internet mail is
               "text/plain; charset=us-ascii".

           Unless this is a part of a "multipart/digest", in which case
           "message/rfc822" is the default.  Note that you can also set the
           default, but you shouldn't: normally only the MIME parser uses this
           feature.

       multipart_boundary
           Instance method.  If this is a header for a multipart message,
           return the "encapsulation boundary" used to separate the parts.
           The boundary is returned exactly as given in the "Content-type:"
           field; that is, the leading double-hyphen ("--") is not prepended.

           Well, almost exactly... this passage from RFC-1521 dictates that we
           remove any trailing spaces:

              If a boundary appears to end with white space, the white space
              must be presumed to have been added by a gateway, and must be deleted.

           Returns undef (not the empty string) if either the message is not
           multipart or if there is no specified boundary.

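           The rules for the boundary (no leading "--", trailing white space
           removed, undef when there is nothing to return) can be sketched as
           follows.  boundary_of() is a hypothetical helper working on a raw
           Content-type value with a simplified pattern; it is not how
           multipart_boundary is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the boundary out of a Content-type value,
# returning undef for non-multipart types or a missing boundary.
sub boundary_of {
    my ($content_type) = @_;
    return undef unless $content_type =~ m{^\s*multipart/}i;
    return undef
        unless $content_type =~ /;\s*boundary\s*=\s*(?:"([^"]*)"|([^;\s]+))/i;
    my $boundary = defined $1 ? $1 : $2;
    $boundary =~ s/\s+\z//;    # RFC-1521: trailing white space is deleted
    return length($boundary) ? $boundary : undef;
}

my $quoted   = boundary_of('multipart/mixed; boundary="simple boundary  "');
my $bare     = boundary_of('multipart/mixed; boundary=abc123');
my $not_multi = boundary_of('text/plain; boundary=x');
my $missing  = boundary_of('multipart/mixed');
```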
       recommended_filename
           Instance method.  Return the recommended external filename.  This
           is used when extracting the data from the MIME stream.

           Returns undef if no filename could be suggested.

NOTES

       Why have separate objects for the entity, head, and body?
           See the documentation for the MIME-tools distribution for the
           rationale behind this decision.

       Why assume that MIME headers are email headers?
           I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's
           using the word "header" where I would use "field"; e.g., to refer
           to "Subject:", "Content-type:", etc.):

               There is also IMHO no requirement [for] MIME::Heads to look
               like [email] headers; so to speak, the MIME::Head [simply stores]
               the attributes of a complex object, e.g.:

                   new MIME::Head type => "text/plain",
                                  charset => ...,
                                  disposition => ..., ... ;

           I agree in principle, but (alas and dammit) RFC-1521 says
           otherwise.  RFC-1521 [MIME] headers are a syntactic subset of
           RFC-822 [email] headers.  Perhaps a better name for these modules
           would be RFC1521:: instead of MIME::, but we're a little beyond
           that stage now.

           In my mind's eye, I see an abstract class, call it MIME::Attrs,
           which does what Achim suggests... so you could say:

                my $attrs = new MIME::Attrs type => "text/plain",
                                            charset => ...,
                                            disposition => ..., ... ;

           We could even make it a superclass of MIME::Head: that way,
           MIME::Head would have to implement its interface, and allow itself
           to be initialized from a MIME::Attrs object.

           However, when you read RFC-1521, you begin to see how much MIME
           information is organized by its presence in particular fields.  I
           imagine that we'd begin to mirror the structure of RFC-1521 fields
           and subfields to such a degree that this might not give us a
           tremendous gain over just having MIME::Head.

       Why all this "occurrence" and "index" jazz?  Isn't every field unique?
           Aaaaaaaaaahh....no.

           Looking at a typical mail message header, it is sooooooo tempting
           to just store the fields as a hash of strings, one string per hash
           entry.  Unfortunately, there's the little matter of the "Received:"
           field, which (unlike "From:", "To:", etc.) will often have multiple
           occurrences; e.g.:

               Received: from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp
                   (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
                    Thu, 21 Dec 95 16:34 CST
               Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
                    (5.65/Ultrix3.0-C) id AA13596;
                    Thu, 21 Dec 95 17:20:38 -0500
               Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
                    (8.6.12/8.6.12) id RAA28069;
                    Thu, 21 Dec 1995 17:27:54 -0500
               Date: Thu, 21 Dec 1995 17:27:54 -0500
               From: Eryq <eryq@rhine.gsfc.nasa.gov>
               Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
               To: eryq@eryq.pr.mcs.net
               Subject: Stuff and things

           The "Received:" field is used for tracing message routes, and
           although it's not generally used for anything other than human
           debugging, I didn't want to inconvenience anyone who actually
           wanted to get at that information.

           I also didn't want to make this a special case; after all, who
           knows what other fields could have multiple occurrences in the
           future?  So, clearly, multiple entries had to somehow be stored
           multiple times... and the different occurrences had to be
           retrievable.
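           One way to picture "stored multiple times and retrievable by
           index" is a hash of arrays instead of a hash of strings.  The
           sketch below is only a model of that idea: %header, add_field(),
           and get_field() are hypothetical names, and nothing here is
           claimed about how Mail::Header lays out its data internally.

```perl
use strict;
use warnings;

# Hypothetical model: each lowercased tag maps to a list of occurrences.
my %header;

sub add_field {
    my ($tag, $text, $index) = @_;
    my $list = $header{lc $tag} ||= [];
    if (defined $index && $index == 0) { unshift @$list, $text }  # prepend
    else                               { push    @$list, $text }  # append
    return scalar @$list;
}

sub get_field {
    my ($tag, $index) = @_;
    my $list = $header{lc $tag} or return undef;
    return $list->[ defined $index ? $index : 0 ];
}

add_field('Received', 'from rhine.gsfc.nasa.gov by gsfc.nasa.gov');
add_field('Received', 'from gsfc.nasa.gov by eryq.pr.mcs.net', 0);
add_field('Subject',  'Stuff and things');

my $most_recent  = get_field('received', 0);
my $least_recent = get_field('Received', -1);
```

           A unique field such as "Subject:" is then just the case where the
           list has one element.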

AUTHOR

       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
       David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

       All rights reserved.  This program is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.

       The more-comprehensive filename extraction is courtesy of Lee E.
       Brotzman, Advanced Data Solutions.

VERSION

       $Revision: 1.14 $ $Date: 2006/03/17 21:03:23 $

perl v5.8.8                       2006-03-17                     MIME::Head(3)