MIME::Head(3) User Contributed Perl Documentation MIME::Head(3)

NAME
MIME::Head - MIME message header (a subclass of Mail::Header)
7
SYNOPSIS
Before reading further, you should see MIME::Tools to make sure that
you understand where this module fits into the grand scheme of things.
Go on, do it now. I'll wait.

Ready? Ok...
14
Construction
    ### Create a new, empty header, and populate it manually:
    $head = MIME::Head->new;
    $head->replace('content-type', 'text/plain; charset=US-ASCII');
    $head->replace('content-length', $len);

    ### Parse a new header from a filehandle:
    $head = MIME::Head->read(\*STDIN);

    ### Parse a new header from a file, or a readable pipe:
    $testhead = MIME::Head->from_file("/tmp/test.hdr");
    $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");

Output
    ### Output to filehandle:
    $head->print(\*STDOUT);

    ### Output as string:
    print STDOUT $head->as_string;
    print STDOUT $head->stringify;

Getting field contents
    ### Is this a reply?
    $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);

    ### Get receipt information:
    print "Last received from: ", $head->get('Received', 0);
    @all_received = $head->get('Received');

    ### Print the subject, or the empty string if none:
    print "Subject: ", $head->get('Subject',0);

    ### Too many hops? Count 'em and see!
    if ($head->count('Received') > 5) { ...

    ### Test whether a given field exists
    warn "missing subject!" if (! $head->count('subject'));

Setting field contents
    ### Declare this to be an HTML header:
    $head->replace('Content-type', 'text/html');

Manipulating field contents
    ### Get rid of internal newlines in fields:
    $head->unfold;

    ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
    $head->decode;

Getting high-level MIME information
    ### Get/set a given MIME attribute:
    unless ($charset = $head->mime_attr('content-type.charset')) {
        $head->mime_attr("content-type.charset" => "US-ASCII");
    }

    ### The content type (e.g., "text/html"):
    $mime_type = $head->mime_type;

    ### The content transfer encoding (e.g., "quoted-printable"):
    $mime_encoding = $head->mime_encoding;

    ### The recommended name when extracted:
    $file_name = $head->recommended_filename;

    ### The boundary text, for multipart messages:
    $boundary = $head->multipart_boundary;
81
DESCRIPTION
A class for parsing in and manipulating RFC-822 message headers, with
some methods geared towards standard (and not so standard) MIME fields
as specified in the various Multipurpose Internet Mail Extensions RFCs
(starting with RFC 2045).
87
PUBLIC INTERFACE
Creation, input, and output
90 new [ARG],[OPTIONS]
91 Class method, inherited. Creates a new header object. Arguments
92 are the same as those in the superclass.
93
94 from_file EXPR,OPTIONS
95 Class or instance method. For convenience, you can use this to
96 parse a header object in from EXPR, which may actually be any
97 expression that can be sent to open() so as to return a readable
98 filehandle. The "file" will be opened, read, and then closed:
99
100 ### Create a new header by parsing in a file:
101 my $head = MIME::Head->from_file("/tmp/test.hdr");
102
103 Since this method can function as either a class constructor or an
104 instance initializer, the above is exactly equivalent to:
105
106 ### Create a new header by parsing in a file:
107 my $head = MIME::Head->new->from_file("/tmp/test.hdr");
108
109 On success, the object will be returned; on failure, the undefined
110 value.
111
112 The OPTIONS are the same as in new(), and are passed into new() if
113 this is invoked as a class method.
114
115 Note: This is really just a convenience front-end onto "read()",
116 provided mostly for backwards-compatibility with MIME-parser 1.0.
117
118 read FILEHANDLE
Instance (or class) method. This initializes a header object by
reading it in from a FILEHANDLE, until the terminating blank line
is encountered. A syntax error or end-of-stream will also halt
processing.
123
124 Supply this routine with a reference to a filehandle glob; e.g.,
125 "\*STDIN":
126
127 ### Create a new header by parsing in STDIN:
128 $head->read(\*STDIN);
129
130 On success, the self object will be returned; on failure, a false
131 value.
132
133 Note: in the MIME world, it is perfectly legal for a header to be
134 empty, consisting of nothing but the terminating blank line. Thus,
135 we can't just use the formula that "no tags equals error".
136
137 Warning: as of the time of this writing, Mail::Header::read did not
138 flag either syntax errors or unexpected end-of-file conditions (an
139 EOF before the terminating blank line). MIME::ParserBase takes
140 this into account.
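As a minimal sketch of "read()" on something other than STDIN, the header below is parsed from an in-memory string. It assumes the MIME-tools distribution is installed; the sample header text and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use MIME::Head;

# Hypothetical raw message: two header fields, the terminating blank
# line, and one line of body text that read() should leave alone.
my $raw = "Subject: Hello\n"
        . "Content-type: text/plain; charset=US-ASCII\n"
        . "\n"
        . "Body text starts here.\n";

# Open a read-only filehandle on the string instead of on a file.
open(my $fh, '<', \$raw) or die "cannot open in-memory handle: $!";

# Instance-method form: create an empty header, then fill it in.
my $head = MIME::Head->new;
$head->read($fh) or die "could not read header";
close($fh);

# count() is true for fields that were parsed, false otherwise.
my $has_subject = $head->count('Subject');
my $has_date    = $head->count('Date');
print "Subject present: ", ($has_subject ? "yes" : "no"), "\n";
print "Date present: ",    ($has_date    ? "yes" : "no"), "\n";
```

Because an empty header is legal, a true return value from "read()" says nothing about which fields arrived; check for the ones you need with "count()".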
141
142 Getting/setting fields
143 The following are methods related to retrieving and modifying the
144 header fields. Some are inherited from Mail::Header, but I've kept the
145 documentation around for convenience.
146
add TAG,TEXT,[INDEX]
Instance method, inherited. Add a new occurrence of the field named
TAG, given by TEXT:

    ### Add the trace information:
    $head->add('Received',
               'from eryq.pr.mcs.net by gonzo.net with smtp');

Normally, the new occurrence will be appended to the existing
occurrences. However, if the optional INDEX argument is 0, then the
new occurrence will be prepended. If you want to be explicit about
appending, specify an INDEX of -1.

Warning: this method always adds new occurrences; it doesn't
overwrite any existing occurrences... so if you just want to change
the value of a field (creating it if necessary), then you probably
don't want to use this method: consider using "replace()" instead.
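The difference between "add()" and "replace()" can be sketched as follows. This assumes MIME-tools is installed; the host names are placeholders, not real relays.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# add() never overwrites: each call creates another occurrence.
$head->add('Received', 'from a.example by b.example with smtp');
$head->add('Received', 'from b.example by c.example with smtp');     # appended
$head->add('Received', 'from c.example by d.example with smtp', 0);  # prepended

# replace() changes the field in place, creating it if necessary.
$head->replace('Subject', 'first draft');
$head->replace('Subject', 'final version');

my $received_count = $head->count('Received');
my $subject_count  = $head->count('Subject');
my $newest         = $head->get('Received', 0);

print "Received occurrences: $received_count\n";
print "Subject occurrences: $subject_count\n";
print "First Received: $newest";
```

Three calls to "add()" leave three "Received:" occurrences, with the prepended one at index 0, while two calls to "replace()" leave a single "Subject:".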
164
165 count TAG
Instance method, inherited. Returns the number of occurrences of a
field; in a boolean context, this tells you whether a given field
exists:
169
170 ### Was a "Subject:" field given?
171 $subject_was_given = $head->count('subject');
172
173 The TAG is treated in a case-insensitive manner. This method
174 returns some false value if the field doesn't exist, and some true
175 value if it does.
176
177 decode [FORCE]
178 Instance method, DEPRECATED. Go through all the header fields,
179 looking for RFC 1522 / RFC 2047 style "Q" (quoted-printable, sort
180 of) or "B" (base64) encoding, and decode them in-place. Fellow
181 Americans, you probably don't know what the hell I'm talking about.
182 Europeans, Russians, et al, you probably do. ":-)".
183
184 This method has been deprecated. See "decode_headers" in
185 MIME::Parser for the full reasons. If you absolutely must use it
186 and don't like the warning, then provide a FORCE:
187
188 "I_NEED_TO_FIX_THIS"
189 Just shut up and do it. Not recommended.
190 Provided only for those who need to keep old scripts functioning.
191
192 "I_KNOW_WHAT_I_AM_DOING"
193 Just shut up and do it. Not recommended.
194 Provided for those who REALLY know what they are doing.
195
What this method does: as an example, consider a valid email header
you might get:

    From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
    To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
    CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
    Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
     =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
     =?US-ASCII?Q?.._cool!?=

That basically decodes to (sorry, I can only approximate the Latin
characters with 7-bit sequences /o and 'e):

    From: Keith Moore <moore@cs.utk.edu>
    To: Keld J/orn Simonsen <keld@dkuug.dk>
    CC: Andr'e Pirard <PIRARD@vm1.ulg.ac.be>
    Subject: If you can read this you understand the example... cool!
213
Note: currently, the decodings are done without regard to the
character set: thus, the Q-encoding "=F8" is simply translated to
the octet (hexadecimal "F8"), period. For piece-by-piece decoding
of a given field, you want the array context of
"MIME::Words::decode_mimewords()".
219
220 Warning: the CRLF+SPACE separator that splits up long encoded words
221 into shorter sequences (see the Subject: example above) gets lost
222 when the field is unfolded, and so decoding after unfolding causes
223 a spurious space to be left in the field. THEREFORE: if you're
224 going to decode, do so BEFORE unfolding!
225
226 This method returns the self object.
227
228 Thanks to Kent Boortz for providing the idea, and the baseline
229 RFC-1522-decoding code.
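Since "decode()" is deprecated, here is a hedged sketch of the piece-by-piece alternative mentioned above, using "decode_mimewords()" from MIME::Words (part of MIME-tools, assumed installed). The input is the "To:" value from the example header; the variable names are illustrative.

```perl
use strict;
use warnings;
use MIME::Words qw(decode_mimewords);

# Single quotes keep the "@" literal.
my $raw = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>';

# Array context: one [DATA, CHARSET] pair per chunk. Chunks that were
# plain text to begin with carry no charset.
my @chunks = decode_mimewords($raw);
for my $chunk (@chunks) {
    my ($data, $charset) = @$chunk;
    printf "%-12s %s\n", ($charset || '(none)'), $data;
}

# Scalar context: the decoded chunks joined into one string of octets,
# with the charset information thrown away.
my $joined = decode_mimewords($raw);
print "$joined\n";
```

The array form is the one to use when the character set matters, since the joined string is just raw octets (here "=F8" becomes the single octet 0xF8).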
230
231 delete TAG,[INDEX]
Instance method, inherited. Delete all occurrences of the field
named TAG, or only the occurrence at INDEX if one is given.
234
235 ### Remove some MIME information:
236 $head->delete('MIME-Version');
237 $head->delete('Content-type');
238
239 get TAG,[INDEX]
Instance method, inherited. Get the contents of field TAG.

If a numeric INDEX is given, returns the occurrence at that index,
or undef if not present:

    ### Print the first and last 'Received:' entries (explicitly):
    print "First, or most recent: ", $head->get('received', 0);
    print "Last, or least recent: ", $head->get('received',-1);

If no INDEX is given, but invoked in a scalar context, then INDEX
simply defaults to 0:

    ### Get the first 'Received:' entry (implicitly):
    my $most_recent = $head->get('received');

If no INDEX is given, and invoked in an array context, then all
occurrences of the field are returned:

    ### Get all 'Received:' entries:
    my @all_received = $head->get('received');
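The context rules above are easy to trip over, because "print" and most function calls impose list context on their arguments. A small sketch, assuming MIME-tools is installed and using placeholder host names:

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->add('Received', 'from b.example by c.example with smtp');
$head->add('Received', 'from a.example by b.example with smtp');

# Scalar assignment: no INDEX, so the first occurrence is returned.
my $first = $head->get('Received');

# List assignment: no INDEX, so every occurrence is returned.
my @all = $head->get('Received');

# An explicit INDEX gives the same answer in either context.
my $last = $head->get('Received', -1);

# print() supplies list context, so pass an INDEX (or use scalar())
# when exactly one value is wanted.
print "First: ", $head->get('Received', 0);
print "Last: ",  $last;
print "Total: ", scalar(@all), "\n";
```

Passing an explicit INDEX is the least surprising habit when a single value is wanted.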
260
261 get_all FIELD
Instance method. Returns the list of all occurrences of the field,
or the empty list if the field is not present:

    ### How did it get here?
    @history = $head->get_all('Received');

Note: I had originally experimented with having "get()" return all
occurrences when invoked in an array context... but that causes a
lot of accidents when you get careless and do stuff like this:

    print "\u$field: ", $head->get($field);

It also made the intuitive behaviour unclear if the INDEX argument
was given in an array context. So I opted for an explicit approach
to asking for all occurrences.
277
278 print [OUTSTREAM]
279 Instance method, override. Print the header out to the given
280 OUTSTREAM, or the currently-selected filehandle if none. The
281 OUTSTREAM may be a filehandle, or any object that responds to a
282 print() message.
283
284 The override actually lets you print to any object that responds to
285 a print() method. This is vital for outputting MIME entities to
286 scalars.
287
288 Also, it defaults to the currently-selected filehandle if none is
289 given (not STDOUT!), so please supply a filehandle to prevent
290 confusion.
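One way to capture the printed header in a scalar is to hand "print()" a filehandle opened on an in-memory string. This is a sketch under the assumption that MIME-tools is installed; IO::Handle is loaded so that the lexical handle responds to a print() method on older perls as well.

```perl
use strict;
use warnings;
use IO::Handle;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Content-type', 'text/plain; charset=US-ASCII');

# A write handle whose "file" is the scalar $buffer.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory handle: $!";

# Always name the output stream; the default is the currently
# selected filehandle, which is not necessarily STDOUT.
$head->print($out);
close($out);

print "Captured ", length($buffer), " bytes:\n", $buffer;
```

For a one-off string, "stringify" (or "as_string") is simpler; the handle form is useful when the same code must also write to real files or sockets.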
291
292 stringify
293 Instance method. Return the header as a string. You can also
294 invoke it as "as_string".
295
296 unfold [FIELD]
Instance method, inherited. Unfold (remove newlines in) the text
of all occurrences of the given FIELD. If the FIELD is omitted, all
fields are unfolded. Returns the "self" object.
300
301 MIME-specific methods
302 All of the following methods extract information from the following
303 fields:
304
305 Content-type
306 Content-transfer-encoding
307 Content-disposition
308
309 Be aware that they do not just return the raw contents of those fields,
310 and in some cases they will fill in sensible (I hope) default values.
311 Use "get()" or "mime_attr()" if you need to grab and process the raw
312 field text.
313
314 Note: some of these methods are provided both as a convenience and for
315 backwards-compatibility only, while others (like
316 recommended_filename()) really do have to be in MIME::Head to work
317 properly, since they look for their value in more than one field.
318 However, if you know that a value is restricted to a single field, you
319 should really use the Mail::Field interface to get it.
320
321 mime_attr ATTR,[VALUE]
322 A quick-and-easy interface to set/get the attributes in structured
323 MIME fields:
324
325 $head->mime_attr("content-type" => "text/html");
326 $head->mime_attr("content-type.charset" => "US-ASCII");
327 $head->mime_attr("content-type.name" => "homepage.html");
328
329 This would cause the final output to look something like this:
330
331 Content-type: text/html; charset=US-ASCII; name="homepage.html"
332
333 Note that the special empty sub-field tag indicates the anonymous
334 first sub-field.
335
336 Giving VALUE as undefined will cause the contents of the named
337 subfield to be deleted:
338
339 $head->mime_attr("content-type.charset" => undef);
340
341 Supplying no VALUE argument just returns the attribute's value, or
342 undefined if it isn't there:
343
344 $type = $head->mime_attr("content-type"); ### text/html
345 $name = $head->mime_attr("content-type.name"); ### homepage.html
346
347 In all cases, the new/current value is returned.
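Putting the three forms of "mime_attr()" together (set, get, delete), as a sketch that assumes MIME-tools is installed:

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# Set: the bare field name addresses the anonymous first sub-field.
$head->mime_attr('content-type'         => 'text/html');
$head->mime_attr('content-type.charset' => 'US-ASCII');

# Get: call with no VALUE argument.
my $type    = $head->mime_attr('content-type');
my $charset = $head->mime_attr('content-type.charset');

# Delete: pass an explicit undef as the VALUE.
$head->mime_attr('content-type.charset' => undef);
my $after_delete = $head->mime_attr('content-type.charset');

print "type: $type\n";
print "charset: $charset\n";
print "charset after delete: ",
      (defined $after_delete ? $after_delete : '(undef)'), "\n";
```

Note the difference between omitting VALUE (a read) and passing undef (a delete).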
348
349 mime_encoding
Instance method. Try real hard to determine the content transfer
encoding (e.g., "base64", "binary"), which is returned in
all-lowercase.

If no encoding could be found, the default of "7bit" is returned. I
quote from RFC 2045 section 6.1:
356
357 This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
358 is assumed if the Content-Transfer-Encoding header field is not present.
359
360 I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are
361 corrected to "7bit"; likewise for "8bit".
362
363 mime_type [DEFAULT]
Instance method. Try "real hard" to determine the content type
(e.g., "text/plain", "image/gif", "x-weird-type"), which is returned
in all-lowercase. "Real hard" means that if no content type could
be found, the default (usually "text/plain") is returned. From RFC
2045 section 5.2:
369
370 Default RFC 822 messages without a MIME Content-Type header are
371 taken by this protocol to be plain text in the US-ASCII character
372 set, which can be explicitly specified as:
373
374 Content-type: text/plain; charset=us-ascii
375
376 This default is assumed if no Content-Type header field is specified.
377
378 Unless this is a part of a "multipart/digest", in which case
379 "message/rfc822" is the default. Note that you can also set the
380 default, but you shouldn't: normally only the MIME parser uses this
381 feature.
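The defaulting and normalization described for "mime_type" and "mime_encoding" can be checked with a short sketch. It assumes MIME-tools is installed; the field values are chosen only to exercise the documented fixups.

```perl
use strict;
use warnings;
use MIME::Head;

# An empty header: both methods fall back to the RFC 2045 defaults.
my $empty            = MIME::Head->new;
my $default_type     = $empty->mime_type;      # documented default: text/plain
my $default_encoding = $empty->mime_encoding;  # documented default: 7bit

# Mixed-case values come back in lowercase.
my $head = MIME::Head->new;
$head->replace('Content-type',              'Image/GIF');
$head->replace('Content-transfer-encoding', 'BASE64');
my $type     = $head->mime_type;
my $encoding = $head->mime_encoding;

# The "7-bit" spelling is corrected to "7bit".
$head->replace('Content-transfer-encoding', '7-bit');
my $fixed = $head->mime_encoding;

print "$default_type $default_encoding $type $encoding $fixed\n";
```

Use "get()" instead if you need to know whether the field was actually present, since these methods hide that distinction.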
382
383 multipart_boundary
384 Instance method. If this is a header for a multipart message,
385 return the "encapsulation boundary" used to separate the parts.
386 The boundary is returned exactly as given in the "Content-type:"
387 field; that is, the leading double-hyphen ("--") is not prepended.
388
389 Well, almost exactly... this passage from RFC 2046 dictates that we
390 remove any trailing spaces:
391
392 If a boundary appears to end with white space, the white space
393 must be presumed to have been added by a gateway, and must be deleted.
394
395 Returns undef (not the empty string) if either the message is not
396 multipart or if there is no specified boundary.
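A sketch of how the returned boundary is typically used, assuming MIME-tools is installed. The boundary string "simple-boundary" is a made-up example.

```perl
use strict;
use warnings;
use MIME::Head;

my $multi = MIME::Head->new;
$multi->replace('Content-type',
                'multipart/mixed; boundary="simple-boundary"');

my $plain = MIME::Head->new;
$plain->replace('Content-type', 'text/plain');

# The boundary comes back without the leading "--"...
my $boundary = $multi->multipart_boundary;

# ...so the caller builds the delimiter lines itself.
my $delimiter = "--$boundary";      # separates the parts
my $closing   = "--$boundary--";    # marks the end of the last part

# Not a multipart message: undef, not the empty string.
my $none = $plain->multipart_boundary;

print "delimiter: $delimiter\n";
print "closing: $closing\n";
print "plain text has boundary: ", (defined $none ? "yes" : "no"), "\n";
```

Test the result with "defined" rather than for truth, so that an unusual boundary such as "0" is not mistaken for a missing one.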
397
398 recommended_filename
399 Instance method. Return the recommended external filename. This
400 is used when extracting the data from the MIME stream.
401
402 Returns undef if no filename could be suggested.
403
NOTES
Why have separate objects for the entity, head, and body?
406 See the documentation for the MIME-tools distribution for the
407 rationale behind this decision.
408
409 Why assume that MIME headers are email headers?
410 I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's
411 using the word "header" where I would use "field"; e.g., to refer
412 to "Subject:", "Content-type:", etc.):
413
414 There is also IMHO no requirement [for] MIME::Heads to look
415 like [email] headers; so to speak, the MIME::Head [simply stores]
416 the attributes of a complex object, e.g.:
417
418 new MIME::Head type => "text/plain",
419 charset => ...,
420 disposition => ..., ... ;
421
422 I agree in principle, but (alas and dammit) RFC 2045 says
423 otherwise. RFC 2045 [MIME] headers are a syntactic subset of
424 RFC-822 [email] headers.
425
426 In my mind's eye, I see an abstract class, call it MIME::Attrs,
427 which does what Achim suggests... so you could say:
428
429 my $attrs = new MIME::Attrs type => "text/plain",
430 charset => ...,
431 disposition => ..., ... ;
432
We could even make it a superclass of MIME::Head: that way,
MIME::Head would have to implement its interface, and allow itself
to be initialized from a MIME::Attrs object.
436
437 However, when you read RFC 2045, you begin to see how much MIME
438 information is organized by its presence in particular fields. I
439 imagine that we'd begin to mirror the structure of RFC 2045 fields
440 and subfields to such a degree that this might not give us a
441 tremendous gain over just having MIME::Head.
442
Why all this "occurrence" and "index" jazz? Isn't every field unique?
Aaaaaaaaaahh....no.

Looking at a typical mail message header, it is sooooooo tempting
to just store the fields as a hash of strings, one string per hash
entry. Unfortunately, there's the little matter of the "Received:"
field, which (unlike "From:", "To:", etc.) will often have multiple
occurrences; e.g.:
451
    Received: from gsfc.nasa.gov by eryq.pr.mcs.net with smtp
        (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
        Thu, 21 Dec 95 16:34 CST
    Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
        (5.65/Ultrix3.0-C) id AA13596;
        Thu, 21 Dec 95 17:20:38 -0500
    Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
        (8.6.12/8.6.12) id RAA28069;
        Thu, 21 Dec 1995 17:27:54 -0500
    Date: Thu, 21 Dec 1995 17:27:54 -0500
    From: Eryq <eryq@rhine.gsfc.nasa.gov>
    Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
    To: eryq@eryq.pr.mcs.net
    Subject: Stuff and things
466
467 The "Received:" field is used for tracing message routes, and
468 although it's not generally used for anything other than human
469 debugging, I didn't want to inconvenience anyone who actually
470 wanted to get at that information.
471
I also didn't want to make this a special case; after all, who
knows what other fields could have multiple occurrences in the
future? So, clearly, multiple entries had to somehow be stored
multiple times... and the different occurrences had to be
retrievable.
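The design point can be illustrated with a toy data structure in plain Perl. This is not how MIME::Head or Mail::Header store fields internally; it is only a sketch of why "one string per hash entry" loses information and a hash of arrays does not. The sample lines are shortened placeholders.

```perl
use strict;
use warnings;

my @lines = (
    'Received: from gsfc.nasa.gov by eryq.pr.mcs.net with smtp',
    'Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov',
    'Subject: Stuff and things',
);

my (%flat, %multi);
for my $line (@lines) {
    my ($tag, $text) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
    $tag = lc $tag;                  # tags are case-insensitive

    $flat{$tag} = $text;             # hash of strings: later values clobber earlier ones
    push @{ $multi{$tag} }, $text;   # hash of arrays: every occurrence is kept, in order
}

my $kept_flat  = 1;                            # a plain hash holds one value per tag
my $kept_multi = scalar @{ $multi{received} }; # 2 occurrences survive
print "flat keeps $kept_flat Received value: $flat{received}\n";
print "multi keeps $kept_multi Received values\n";
print "occurrence 0: $multi{received}[0]\n";
```

With the array form, an "index" is simply a position in the per-tag list, which is what the INDEX arguments to "get()", "add()", and "delete()" refer to.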
477
SEE ALSO
Mail::Header, Mail::Field, MIME::Words, MIME::Tools

AUTHOR
Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com
484
485 All rights reserved. This program is free software; you can
486 redistribute it and/or modify it under the same terms as Perl itself.
487
488 The more-comprehensive filename extraction is courtesy of Lee E.
489 Brotzman, Advanced Data Solutions.
490
perl v5.10.1 2008-06-30 MIME::Head(3)