MIME::Head(3pm)

MIME::Head(3)         User Contributed Perl Documentation        MIME::Head(3)

NAME

       MIME::Head - MIME message header (a subclass of Mail::Header)

SYNOPSIS

       Before reading further, you should see MIME::Tools to make sure that
       you understand where this module fits into the grand scheme of things.
       Go on, do it now.  I'll wait.

       Ready?  Ok...

   Construction
           ### Create a new, empty header, and populate it manually:
           $head = MIME::Head->new;
           $head->replace('content-type', 'text/plain; charset=US-ASCII');
           $head->replace('content-length', $len);

           ### Parse a new header from a filehandle:
           $head = MIME::Head->read(\*STDIN);

           ### Parse a new header from a file, or a readable pipe:
           $testhead = MIME::Head->from_file("/tmp/test.hdr");
           $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");

   Output
           ### Output to filehandle:
           $head->print(\*STDOUT);

           ### Output as string:
           print STDOUT $head->as_string;
           print STDOUT $head->stringify;

   Getting field contents
           ### Is this a reply?
           $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);

           ### Get receipt information:
           print "Last received from: ", $head->get('Received', 0);
           @all_received = $head->get('Received');

           ### Print the subject, or the empty string if none:
           print "Subject: ", $head->get('Subject',0);

           ### Too many hops?  Count 'em and see!
           if ($head->count('Received') > 5) { ...

           ### Test whether a given field exists
           warn "missing subject!" if (! $head->count('subject'));

   Setting field contents
           ### Declare this to be an HTML header:
           $head->replace('Content-type', 'text/html');

   Manipulating field contents
           ### Get rid of internal newlines in fields:
           $head->unfold;

           ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
           $head->decode;

   Getting high-level MIME information
           ### Get/set a given MIME attribute:
           unless ($charset = $head->mime_attr('content-type.charset')) {
               $head->mime_attr("content-type.charset" => "US-ASCII");
           }

           ### The content type (e.g., "text/html"):
           $mime_type     = $head->mime_type;

           ### The content transfer encoding (e.g., "quoted-printable"):
           $mime_encoding = $head->mime_encoding;

           ### The recommended name when extracted:
           $file_name     = $head->recommended_filename;

           ### The boundary text, for multipart messages:
           $boundary      = $head->multipart_boundary;

DESCRIPTION

       A class for parsing and manipulating RFC-822 message headers, with
       some methods geared towards standard (and not so standard) MIME fields
       as specified in the various Multipurpose Internet Mail Extensions RFCs
       (starting with RFC 2045).

PUBLIC INTERFACE

   Creation, input, and output
       new [ARG],[OPTIONS]
           Class method, inherited.  Creates a new header object.  Arguments
           are the same as those in the superclass.

       from_file EXPR,OPTIONS
           Class or instance method.  For convenience, you can use this to
           parse a header object in from EXPR, which may actually be any
           expression that can be sent to open() so as to return a readable
           filehandle.  The "file" will be opened, read, and then closed:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->from_file("/tmp/test.hdr");

           Since this method can function as either a class constructor or an
           instance initializer, the above is exactly equivalent to:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->new->from_file("/tmp/test.hdr");

           On success, the object will be returned; on failure, the undefined
           value.

           The OPTIONS are the same as in new(), and are passed into new() if
           this is invoked as a class method.

           Note: This is really just a convenience front-end onto "read()",
           provided mostly for backwards-compatibility with MIME-parser 1.0.

       read FILEHANDLE
           Instance (or class) method.  This initializes a header object by
           reading it in from a FILEHANDLE, until the terminating blank line
           is encountered.  A syntax error or end-of-stream will also halt
           processing.

           Supply this routine with a reference to a filehandle glob; e.g.,
           "\*STDIN":

               ### Create a new header by parsing in STDIN:
               $head->read(\*STDIN);

           On success, the self object will be returned; on failure, a false
           value.

           Note: in the MIME world, it is perfectly legal for a header to be
           empty, consisting of nothing but the terminating blank line.  Thus,
           we can't just use the formula that "no tags equals error".

           Warning: as of the time of this writing, Mail::Header::read did not
           flag either syntax errors or unexpected end-of-file conditions (an
           EOF before the terminating blank line).  MIME::ParserBase takes
           this into account.
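
           As a minimal sketch of read(), the following parses a header from
           an in-memory filehandle rather than STDIN.  The header text and
           variable names are made up for illustration, and it assumes a Perl
           with in-memory filehandles (5.8 or later):

```perl
use strict;
use warnings;
use MIME::Head;

# Hypothetical message text; the blank line terminates the header.
my $raw = "Subject: Hello\nContent-type: text/plain\n\nBody is not read.\n";

# Open a read-only in-memory filehandle on the string.
open(my $fh, '<', \$raw) or die "cannot open in-memory handle: $!";

# read() returns the header object on success, a false value on failure.
my $head = MIME::Head->new->read($fh) or die "could not read header";
close($fh);

my $has_subject = $head->count('Subject');   # true if the field was parsed
my $type        = $head->mime_type;
```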

   Getting/setting fields
       The following are methods related to retrieving and modifying the
       header fields.  Some are inherited from Mail::Header, but I've kept the
       documentation around for convenience.

       add TAG,TEXT,[INDEX]
           Instance method, inherited.  Add a new occurrence of the field
           named TAG, given by TEXT:

               ### Add the trace information:
               $head->add('Received',
                          'from eryq.pr.mcs.net by gonzo.net with smtp');

           Normally, the new occurrence will be appended to the existing
           occurrences.  However, if the optional INDEX argument is 0, then
           the new occurrence will be prepended.  If you want to be explicit
           about appending, specify an INDEX of -1.

           Warning: this method always adds new occurrences; it doesn't
           overwrite any existing occurrences... so if you just want to change
           the value of a field (creating it if necessary), then you probably
           don't want to use this method: consider using "replace()" instead.
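
           The difference between "add()" and "replace()" can be sketched as
           follows.  The host names are placeholders, not taken from any real
           message:

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# add() always creates another occurrence of the field.
$head->add('Received', 'from a.example by b.example');       # appended
$head->add('Received', 'from b.example by c.example', -1);   # appended, explicitly
$head->add('Received', 'from c.example by d.example', 0);    # prepended
my $hops = $head->count('Received');

# replace() sets the value, creating the field only if it is missing.
$head->replace('Subject', 'first draft');
$head->replace('Subject', 'final version');
my $subjects = $head->count('Subject');
```

           Here $hops counts three occurrences, while $subjects stays at one
           because the second "replace()" overwrote the first value.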

       count TAG
           Instance method, inherited.  Returns the number of occurrences of a
           field; in a boolean context, this tells you whether a given field
           exists:

               ### Was a "Subject:" field given?
               $subject_was_given = $head->count('subject');

           The TAG is treated in a case-insensitive manner.  This method
           returns some false value if the field doesn't exist, and some true
           value if it does.

       decode [FORCE]
           Instance method, DEPRECATED.  Go through all the header fields,
           looking for RFC 1522 / RFC 2047 style "Q" (quoted-printable, sort
           of) or "B" (base64) encoding, and decode them in-place.  Fellow
           Americans, you probably don't know what the hell I'm talking about.
           Europeans, Russians, et al., you probably do.  ":-)".

           This method has been deprecated.  See "decode_headers" in
           MIME::Parser for the full reasons.  If you absolutely must use it
           and don't like the warning, then provide a FORCE:

              "I_NEED_TO_FIX_THIS"
                     Just shut up and do it.  Not recommended.
                     Provided only for those who need to keep old scripts functioning.

              "I_KNOW_WHAT_I_AM_DOING"
                     Just shut up and do it.  Not recommended.
                     Provided for those who REALLY know what they are doing.

           What this method does: as an example, consider a valid email
           header you might get:

               From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
               To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
               CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
                =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
                =?US-ASCII?Q?.._cool!?=

           That basically decodes to (sorry, I can only approximate the Latin
           characters with 7 bit sequences /o and 'e):

               From: Keith Moore <moore@cs.utk.edu>
               To: Keld J/orn Simonsen <keld@dkuug.dk>
               CC: Andr'e  Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: If you can read this you understand the example... cool!

           Note: currently, the decodings are done without regard to the
           character set: thus, the Q-encoding "=F8" is simply translated to
           the octet (hexadecimal "F8"), period.  For piece-by-piece decoding
           of a given field, you want the array context of
           "MIME::Words::decode_mimewords()".

           Warning: the CRLF+SPACE separator that splits up long encoded words
           into shorter sequences (see the Subject: example above) gets lost
           when the field is unfolded, and so decoding after unfolding causes
           a spurious space to be left in the field.  THEREFORE: if you're
           going to decode, do so BEFORE unfolding!

           This method returns the self object.

           Thanks to Kent Boortz for providing the idea, and the baseline
           RFC-1522-decoding code.
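
           For the piece-by-piece alternative mentioned above, here is a
           small sketch using decode_mimewords() from MIME::Words (part of
           the same MIME-tools distribution) on the "To:" value from the
           example.  Like decode(), it only produces raw octets; mapping them
           to characters according to the reported charset is left to the
           caller:

```perl
use strict;
use warnings;
use MIME::Words qw(decode_mimewords);

my $field = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>';

# Array context: one [DATA, CHARSET] pair per piece; the charset is
# undefined for pieces that were not encoded words.
my @pieces = decode_mimewords($field);
for my $piece (@pieces) {
    my ($data, $charset) = @$piece;
    printf "%-12s %d octets\n", (defined $charset ? $charset : '(plain)'),
        length $data;
}

# Scalar context: the decoded pieces joined into a single string.
my $joined = decode_mimewords($field);
```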

       delete TAG,[INDEX]
           Instance method, inherited.  Delete all occurrences of the field
           named TAG.

               ### Remove some MIME information:
               $head->delete('MIME-Version');
               $head->delete('Content-type');

       get TAG,[INDEX]
           Instance method, inherited.  Get the contents of field TAG.

           If a numeric INDEX is given, returns the occurrence at that index,
           or undef if not present:

               ### Print the first and last 'Received:' entries (explicitly):
               print "First, or most recent: ", $head->get('received', 0);
               print "Last, or least recent: ", $head->get('received',-1);

           If no INDEX is given, but invoked in a scalar context, then INDEX
           simply defaults to 0:

               ### Get the first 'Received:' entry (implicitly):
               my $most_recent = $head->get('received');

           If no INDEX is given, and invoked in an array context, then all
           occurrences of the field are returned:

               ### Get all 'Received:' entries:
               my @all_received = $head->get('received');

       get_all FIELD
           Instance method.  Returns the list of all occurrences of the field,
           or the empty list if the field is not present:

               ### How did it get here?
               @history = $head->get_all('Received');

           Note: I had originally experimented with having "get()" return all
           occurrences when invoked in an array context... but that causes a
           lot of accidents when you get careless and do stuff like this:

               print "\u$field: ", $head->get($field);

           It also made the intuitive behaviour unclear if the INDEX argument
           was given in an array context.  So I opted for an explicit approach
           to asking for all occurrences.
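
           A short sketch of the explicit style recommended here: pass an
           INDEX to "get()" when you want one occurrence, and call
           "get_all()" when you want the list.  The field values are
           placeholders:

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->add('Received', 'from b.example by c.example');
$head->add('Received', 'from a.example by b.example');

# One occurrence, chosen explicitly by index.
my $first = $head->get('Received', 0);

# Every occurrence, or the empty list if the field is absent.
my @history = $head->get_all('Received');
my @missing = $head->get_all('X-No-Such-Field');

printf "%d hops, %d unknown fields\n", scalar(@history), scalar(@missing);
```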

       print [OUTSTREAM]
           Instance method, override.  Print the header out to the given
           OUTSTREAM, or the currently-selected filehandle if none.  The
           OUTSTREAM may be a filehandle, or any object that responds to a
           print() message; this is vital for outputting MIME entities to
           scalars.

           Note that the default is the currently-selected filehandle, which
           is not necessarily STDOUT, so please supply a filehandle to
           prevent confusion.
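
           As an illustration of supplying the OUTSTREAM explicitly, this
           sketch prints a header into a scalar through an in-memory
           filehandle.  The variable names are illustrative, and IO::Handle
           is loaded so that the handle responds to print() on older Perls
           as well:

```perl
use strict;
use warnings;
use IO::Handle;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Subject', 'Hello');
$head->replace('Content-type', 'text/plain');

# Open a write-only in-memory filehandle and print the header to it.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory handle: $!";
$head->print($out);
close($out);

# $buffer now holds the same text that as_string() would return.
```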

       stringify
           Instance method.  Return the header as a string.  You can also
           invoke it as "as_string".

       unfold [FIELD]
           Instance method, inherited.  Unfold (remove newlines in) the text
           of all occurrences of the given FIELD.  If the FIELD is omitted,
           all fields are unfolded.  Returns the "self" object.

   MIME-specific methods
       All of the following methods extract information from the following
       fields:

           Content-type
           Content-transfer-encoding
           Content-disposition

       Be aware that they do not just return the raw contents of those fields,
       and in some cases they will fill in sensible (I hope) default values.
       Use "get()" or "mime_attr()" if you need to grab and process the raw
       field text.

       Note: some of these methods are provided both as a convenience and for
       backwards-compatibility only, while others (like
       recommended_filename()) really do have to be in MIME::Head to work
       properly, since they look for their value in more than one field.
       However, if you know that a value is restricted to a single field, you
       should really use the Mail::Field interface to get it.

       mime_attr ATTR,[VALUE]
           A quick-and-easy interface to set/get the attributes in structured
           MIME fields:

               $head->mime_attr("content-type"         => "text/html");
               $head->mime_attr("content-type.charset" => "US-ASCII");
               $head->mime_attr("content-type.name"    => "homepage.html");

           This would cause the final output to look something like this:

               Content-type: text/html; charset=US-ASCII; name="homepage.html"

           Note that an ATTR with no sub-field part (e.g., "content-type")
           refers to the anonymous first sub-field.

           Giving VALUE as undefined will cause the contents of the named
           subfield to be deleted:

               $head->mime_attr("content-type.charset" => undef);

           Supplying no VALUE argument just returns the attribute's value, or
           undefined if it isn't there:

               $type = $head->mime_attr("content-type");      ### text/html
               $name = $head->mime_attr("content-type.name"); ### homepage.html

           In all cases, the new/current value is returned.
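
           Putting the set, get, and delete forms together, a minimal sketch
           (starting from a Content-type field created with "replace()";
           the values are examples only):

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Content-type', 'text/html');

# Set a named sub-field, then read it back.
$head->mime_attr('content-type.charset' => 'US-ASCII');
my $type    = $head->mime_attr('content-type');           # anonymous first sub-field
my $charset = $head->mime_attr('content-type.charset');

# An undefined VALUE deletes the sub-field again.
$head->mime_attr('content-type.charset' => undef);
my $after_delete = $head->mime_attr('content-type.charset');
```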

       mime_encoding
           Instance method.  Try real hard to determine the content transfer
           encoding (e.g., "base64", "binary"), which is returned in all-
           lowercase.

           If no encoding could be found, the default of "7bit" is returned.
           I quote from RFC 2045 section 6.1:

               This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
               is assumed if the Content-Transfer-Encoding header field is not present.

           I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are
           corrected to "7bit"; likewise for "8bit".
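
           The default and the lowercasing can be observed with a sketch
           like the following.  The last case assumes the "7-bit" spelling
           is normalized as described above:

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# No Content-transfer-encoding field at all: the RFC 2045 default applies.
my $default = $head->mime_encoding;

# Mixed case in the field is returned in all-lowercase.
$head->replace('Content-transfer-encoding', 'Quoted-Printable');
my $lowered = $head->mime_encoding;

# A common misspelling of "7bit" is corrected.
$head->replace('Content-transfer-encoding', '7-bit');
my $fixed = $head->mime_encoding;
```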

       mime_type [DEFAULT]
           Instance method.  Try "real hard" to determine the content type
           (e.g., "text/plain", "image/gif", "x-weird-type"), which is
           returned in all-lowercase.  "Real hard" means that if no content
           type could be found, the default (usually "text/plain") is
           returned.  From RFC 2045 section 5.2:

              Default RFC 822 messages without a MIME Content-Type header are
              taken by this protocol to be plain text in the US-ASCII character
              set, which can be explicitly specified as:

                 Content-type: text/plain; charset=us-ascii

              This default is assumed if no Content-Type header field is specified.

           The exception is a part of a "multipart/digest", for which
           "message/rfc822" is the default.  Note that you can also set the
           default, but you shouldn't: normally only the MIME parser uses this
           feature.
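
           A brief sketch of both behaviours, the fallback and the
           lowercasing (the field value is an arbitrary example):

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# No Content-type field yet, so the default is reported.
my $default = $head->mime_type;

# Parameters are ignored and the type is returned in lowercase.
$head->replace('Content-type', 'Text/HTML; charset=ISO-8859-1');
my $type = $head->mime_type;
```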

       multipart_boundary
           Instance method.  If this is a header for a multipart message,
           return the "encapsulation boundary" used to separate the parts.
           The boundary is returned exactly as given in the "Content-type:"
           field; that is, the leading double-hyphen ("--") is not prepended.

           Well, almost exactly... this passage from RFC 2046 dictates that we
           remove any trailing spaces:

              If a boundary appears to end with white space, the white space
              must be presumed to have been added by a gateway, and must be deleted.

           Returns undef (not the empty string) if either the message is not
           multipart or if there is no specified boundary.
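
           Since the double-hyphen is not included, a caller that wants the
           delimiter line has to add it.  A sketch, with "frontier" as a
           made-up boundary value:

```perl
use strict;
use warnings;
use MIME::Head;

my $multi = MIME::Head->new;
$multi->replace('Content-type', 'multipart/mixed; boundary="frontier"');

# The boundary comes back without quotes and without the leading "--".
my $boundary  = $multi->multipart_boundary;
my $delimiter = "--$boundary";

# A header with no boundary parameter yields undef, not the empty string.
my $plain = MIME::Head->new;
$plain->replace('Content-type', 'text/plain');
my $none = $plain->multipart_boundary;
```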

       recommended_filename
           Instance method.  Return the recommended external filename.  This
           is used when extracting the data from the MIME stream.

           Returns undef if no filename could be suggested.
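
           Because the filename may live in more than one field, the sketch
           below tries each of the two usual places separately.  It assumes
           the "filename" parameter of Content-disposition and the "name"
           parameter of Content-type are the sources consulted; the file
           names are invented:

```perl
use strict;
use warnings;
use MIME::Head;

# Name supplied through Content-disposition.
my $by_disposition = MIME::Head->new;
$by_disposition->replace('Content-disposition',
                         'attachment; filename="report.pdf"');
my $name1 = $by_disposition->recommended_filename;

# Name supplied through Content-type.
my $by_type = MIME::Head->new;
$by_type->replace('Content-type', 'application/pdf; name="summary.pdf"');
my $name2 = $by_type->recommended_filename;

# Neither field present: nothing can be suggested.
my $name3 = MIME::Head->new->recommended_filename;
```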

NOTES

       Why have separate objects for the entity, head, and body?
           See the documentation for the MIME-tools distribution for the
           rationale behind this decision.

       Why assume that MIME headers are email headers?
           I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's
           using the word "header" where I would use "field"; e.g., to refer
           to "Subject:", "Content-type:", etc.):

               There is also IMHO no requirement [for] MIME::Heads to look
               like [email] headers; so to speak, the MIME::Head [simply stores]
               the attributes of a complex object, e.g.:

                   new MIME::Head type => "text/plain",
                                  charset => ...,
                                  disposition => ..., ... ;

           I agree in principle, but (alas and dammit) RFC 2045 says
           otherwise.  RFC 2045 [MIME] headers are a syntactic subset of
           RFC-822 [email] headers.

           In my mind's eye, I see an abstract class, call it MIME::Attrs,
           which does what Achim suggests... so you could say:

                my $attrs = new MIME::Attrs type => "text/plain",
                                            charset => ...,
                                            disposition => ..., ... ;

           We could even make it a superclass of MIME::Head: that way,
           MIME::Head would have to implement its interface, and allow itself
           to be initialized from a MIME::Attrs object.

           However, when you read RFC 2045, you begin to see how much MIME
           information is organized by its presence in particular fields.  I
           imagine that we'd begin to mirror the structure of RFC 2045 fields
           and subfields to such a degree that this might not give us a
           tremendous gain over just having MIME::Head.

       Why all this "occurrence" and "index" jazz?  Isn't every field unique?
           Aaaaaaaaaahh....no.

           Looking at a typical mail message header, it is sooooooo tempting
           to just store the fields as a hash of strings, one string per hash
           entry.  Unfortunately, there's the little matter of the "Received:"
           field, which (unlike "From:", "To:", etc.) will often have multiple
           occurrences; e.g.:

               Received: from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp
                   (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
                    Thu, 21 Dec 95 16:34 CST
               Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
                    (5.65/Ultrix3.0-C) id AA13596;
                    Thu, 21 Dec 95 17:20:38 -0500
               Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
                    (8.6.12/8.6.12) id RAA28069;
                    Thu, 21 Dec 1995 17:27:54 -0500
               Date: Thu, 21 Dec 1995 17:27:54 -0500
               From: Eryq <eryq@rhine.gsfc.nasa.gov>
               Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
               To: eryq@eryq.pr.mcs.net
               Subject: Stuff and things

           The "Received:" field is used for tracing message routes, and
           although it's not generally used for anything other than human
           debugging, I didn't want to inconvenience anyone who actually
           wanted to get at that information.

           I also didn't want to make this a special case; after all, who
           knows what other fields could have multiple occurrences in the
           future?  So, clearly, multiple entries had to somehow be stored
           multiple times... and the different occurrences had to be
           retrievable.

AUTHOR

       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
       David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

       All rights reserved.  This program is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.

       The more-comprehensive filename extraction is courtesy of Lee E.
       Brotzman, Advanced Data Solutions.

perl v5.12.0                      2010-04-22                     MIME::Head(3)
