MIME::Head(3)   User Contributed Perl Documentation   MIME::Head(3)

MIME::Head - MIME message header (a subclass of Mail::Header)

Before reading further, you should see MIME::Tools to make sure that
you understand where this module fits into the grand scheme of things.
Go on, do it now. I'll wait.

Ready? Ok...

Construction

### Create a new, empty header, and populate it manually:
$head = MIME::Head->new;
$head->replace('content-type', 'text/plain; charset=US-ASCII');
$head->replace('content-length', $len);

### Parse a new header from a filehandle:
$head = MIME::Head->read(\*STDIN);

### Parse a new header from a file, or a readable pipe:
$testhead = MIME::Head->from_file("/tmp/test.hdr");
$a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");

Output

### Output to filehandle:
$head->print(\*STDOUT);

### Output as string:
print STDOUT $head->as_string;
print STDOUT $head->stringify;

Getting field contents

### Is this a reply?
$is_reply = 1 if ($head->get('Subject') =~ /^Re: /);

### Get receipt information:
print "Last received from: ", $head->get('Received', 0), "\n";
@all_received = $head->get('Received');

### Print the subject, or the empty string if none:
print "Subject: ", $head->get('Subject',0), "\n";

### Too many hops? Count 'em and see!
if ($head->count('Received') > 5) { ...

### Test whether a given field exists
warn "missing subject!" if (! $head->count('subject'));

Setting field contents

### Declare this to be an HTML header:
$head->replace('Content-type', 'text/html');

Manipulating field contents

### Get rid of internal newlines in fields:
$head->unfold;

### Decode any Q- or B-encoded-text in fields (DEPRECATED):
$head->decode;

Getting high-level MIME information

### Get/set a given MIME attribute:
unless ($charset = $head->mime_attr('content-type.charset')) {
    $head->mime_attr("content-type.charset" => "US-ASCII");
}

### The content type (e.g., "text/html"):
$mime_type = $head->mime_type;

### The content transfer encoding (e.g., "quoted-printable"):
$mime_encoding = $head->mime_encoding;

### The recommended name when extracted:
$file_name = $head->recommended_filename;

### The boundary text, for multipart messages:
$boundary = $head->multipart_boundary;

A class for parsing in and manipulating RFC-822 message headers, with
some methods geared towards standard (and not so standard) MIME fields
as specified in RFC-1521, Multipurpose Internet Mail Extensions.

Creation, input, and output

new [ARG],[OPTIONS]
Class method, inherited. Creates a new header object. Arguments
are the same as those in the superclass.

from_file EXPR,OPTIONS
Class or instance method. For convenience, you can use this to
parse a header object in from EXPR, which may actually be any
expression that can be sent to open() so as to return a readable
filehandle. The "file" will be opened, read, and then closed:

### Create a new header by parsing in a file:
my $head = MIME::Head->from_file("/tmp/test.hdr");

Since this method can function as either a class constructor or an
instance initializer, the above is exactly equivalent to:

### Create a new header by parsing in a file:
my $head = MIME::Head->new->from_file("/tmp/test.hdr");

On success, the object will be returned; on failure, the undefined
value.

The OPTIONS are the same as in new(), and are passed into new() if
this is invoked as a class method.

Note: This is really just a convenience front-end onto "read()",
provided mostly for backwards-compatibility with MIME-parser 1.0.

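Here is a minimal sketch of both calling forms of from_file(), assuming
MIME-tools is installed. The temporary file and its contents are made up
for the illustration; only from_file(), new(), count() and stringify()
come from this page.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use MIME::Head;

# Write a small header, terminated by a blank line, to a temporary file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "Subject: Stuff and things\n",
                "Content-type: text/plain; charset=US-ASCII\n",
                "\n";
close $tmp_fh or die "close $tmp_path: $!";

# Class-method form: construct and parse in one step.
my $head = MIME::Head->from_file($tmp_path)
    or die "could not parse a header from $tmp_path\n";

# Instance-method form, which the text above describes as equivalent.
my $same = MIME::Head->new->from_file($tmp_path)
    or die "could not parse a header from $tmp_path\n";

print $head->stringify;
```

Checking the return value matters here, since a failed open yields the
undefined value rather than an exception.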
read FILEHANDLE
Instance (or class) method. This initializes a header object by
reading it in from a FILEHANDLE, until the terminating blank line
is encountered. A syntax error or end-of-stream will also halt
processing.

Supply this routine with a reference to a filehandle glob; e.g.,
"\*STDIN":

### Create a new header by parsing in STDIN:
$head->read(\*STDIN);

On success, the self object will be returned; on failure, a false
value.

Note: in the MIME world, it is perfectly legal for a header to be
empty, consisting of nothing but the terminating blank line. Thus,
we can't just use the formula that "no tags equals error".

Warning: as of the time of this writing, Mail::Header::read did not
flag either syntax errors or unexpected end-of-file conditions (an
EOF before the terminating blank line). MIME::ParserBase takes
this into account.

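The following sketch reads a header from an in-memory filehandle instead
of STDIN, and then reads a legal empty header. It assumes MIME-tools is
installed and that a lexical filehandle opened on a scalar is acceptable
wherever a glob reference is; the message text is invented.

```perl
use strict;
use warnings;
use MIME::Head;

# A tiny message: two header fields, the blank line, then a body.
my $message = "Subject: hello\n"
            . "Content-type: text/plain\n"
            . "\n"
            . "This is the body.\n";

open(my $in, '<', \$message) or die "cannot open in-memory handle: $!";
my $head = MIME::Head->read($in) or die "header could not be read\n";
print "fields parsed: subject=", $head->count('subject'),
      " content-type=", $head->count('content-type'), "\n";

# A header consisting of nothing but the terminating blank line is legal,
# so "no fields" must not be treated as a parse failure.
my $blank = "\nBody only.\n";
open(my $in2, '<', \$blank) or die "cannot open in-memory handle: $!";
my $empty = MIME::Head->read($in2);
print "empty header is still an object\n" if ref $empty;
```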
Getting/setting fields

The following are methods related to retrieving and modifying the
header fields. Some are inherited from Mail::Header, but I've kept the
documentation around for convenience.

add TAG,TEXT,[INDEX]
Instance method, inherited. Add a new occurrence of the field named
TAG, given by TEXT:

### Add the trace information:
$head->add('Received',
           'from eryq.pr.mcs.net by gonzo.net with smtp');

Normally, the new occurrence will be appended to the existing
occurrences. However, if the optional INDEX argument is 0, then the
new occurrence will be prepended. If you want to be explicit about
appending, specify an INDEX of -1.

Warning: this method always adds new occurrences; it doesn't
overwrite any existing occurrences... so if you just want to change
the value of a field (creating it if necessary), then you probably
don't want to use this method: consider using "replace()" instead.

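To make the add()/replace() distinction concrete, here is a small sketch,
assuming MIME-tools is installed. The hostnames and subject text are
invented; chomp is applied defensively in case the returned value carries
a trailing newline.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# add() always creates another occurrence.
$head->add('Received', 'from b.example by c.example with smtp');     # appended
$head->add('Received', 'from a.example by b.example with smtp', 0);  # INDEX 0: prepended

# replace() changes the value of a field, creating it if necessary.
$head->replace('Subject', 'Draft');
$head->replace('Subject', 'Final');

my $first = $head->get('Received', 0);
chomp $first;
my $subject = $head->get('Subject', 0);
chomp $subject;

print "Received occurrences: ", $head->count('Received'), "\n";
print "First Received:       $first\n";
print "Subject occurrences:  ", $head->count('Subject'), "\n";
print "Subject:              $subject\n";
```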
count TAG
Instance method, inherited. Returns the number of occurrences of a
field; in a boolean context, this tells you whether a given field
exists:

### Was a "Subject:" field given?
$subject_was_given = $head->count('subject');

The TAG is treated in a case-insensitive manner. This method
returns some false value if the field doesn't exist, and some true
value if it does.

decode [FORCE]
Instance method, DEPRECATED. Go through all the header fields,
looking for RFC-1522-style "Q" (quoted-printable, sort of) or "B"
(base64) encoding, and decode them in-place. Fellow Americans, you
probably don't know what the hell I'm talking about. Europeans,
Russians, et al, you probably do. ":-)".

This method has been deprecated. See "decode_headers" in
MIME::Parser for the full reasons. If you absolutely must use it
and don't like the warning, then provide a FORCE:

"I_NEED_TO_FIX_THIS"
Just shut up and do it. Not recommended.
Provided only for those who need to keep old scripts functioning.

"I_KNOW_WHAT_I_AM_DOING"
Just shut up and do it. Not recommended.
Provided for those who REALLY know what they are doing.

What this method does: as an example, let's consider a valid
email header you might get:

From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
    =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
    =?US-ASCII?Q?.._cool!?=

That basically decodes to (sorry, I can only approximate the Latin
characters with 7 bit sequences /o and 'e):

From: Keith Moore <moore@cs.utk.edu>
To: Keld J/orn Simonsen <keld@dkuug.dk>
CC: Andr'e Pirard <PIRARD@vm1.ulg.ac.be>
Subject: If you can read this you understand the example... cool!

Note: currently, the decodings are done without regard to the
character set: thus, the Q-encoding "=F8" is simply translated to the
octet (hexadecimal "F8"), period. For piece-by-piece decoding of a
given field, you want the array context of
"MIME::Words::decode_mimewords()".

Warning: the CRLF+SPACE separator that splits up long encoded words
into shorter sequences (see the Subject: example above) gets lost
when the field is unfolded, and so decoding after unfolding causes
a spurious space to be left in the field. THEREFORE: if you're
going to decode, do so BEFORE unfolding!

This method returns the self object.

Thanks to Kent Boortz for providing the idea, and the baseline
RFC-1522-decoding code.

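For the piece-by-piece decoding mentioned in the note above, a sketch
along these lines may help. It assumes the decode_mimewords() function
exported by the MIME::Words module of MIME-tools, and uses the "To:"
value from the example; like decode(), it yields raw octets and does not
convert between character sets.

```perl
use strict;
use warnings;
use MIME::Words qw(decode_mimewords);

# Single quotes keep Perl from interpolating the "@" in the address.
my $raw = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>';

# List context: one [DATA, CHARSET] pair per token. CHARSET is
# undefined for text that was not inside an encoded word.
my @tokens = decode_mimewords($raw);
for my $token (@tokens) {
    my ($data, $charset) = @$token;
    printf "%-12s %s\n", (defined $charset ? $charset : '(none)'), $data;
}

# Scalar context: the decoded pieces joined into one string of octets.
my $decoded = decode_mimewords($raw);
print "$decoded\n";
```

Keeping the charset next to each piece is what lets a caller decide how
to display "=F8", which this module otherwise treats as a bare octet.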
delete TAG,[INDEX]
Instance method, inherited. Delete all occurrences of the field
named TAG.

### Remove some MIME information:
$head->delete('MIME-Version');
$head->delete('Content-type');

get TAG,[INDEX]
Instance method, inherited. Get the contents of field TAG.

If a numeric INDEX is given, returns the occurrence at that index,
or undef if not present:

### Print the first and last 'Received:' entries (explicitly):
print "First, or most recent: ", $head->get('received', 0), "\n";
print "Last, or least recent: ", $head->get('received',-1), "\n";

If no INDEX is given, but invoked in a scalar context, then INDEX
simply defaults to 0:

### Get the first 'Received:' entry (implicitly):
my $most_recent = $head->get('received');

If no INDEX is given, and invoked in an array context, then all
occurrences of the field are returned:

### Get all 'Received:' entries:
my @all_received = $head->get('received');

get_all FIELD
Instance method. Returns the list of all occurrences of the field,
or the empty list if the field is not present:

### How did it get here?
@history = $head->get_all('Received');

Note: I had originally experimented with having "get()" return all
occurrences when invoked in an array context... but that causes a
lot of accidents when you get careless and do stuff like this:

print "\u$field: ", $head->get($field), "\n";

It also made the intuitive behaviour unclear if the INDEX argument
was given in an array context. So I opted for an explicit approach
to asking for all occurrences.

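A short sketch of the explicit style, assuming MIME-tools is installed:
get() with an INDEX for a single occurrence, get_all() for the whole
list. The field values and the "X-Not-There" tag are made up.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->add('Received', 'from b.example by c.example');
$head->add('Received', 'from a.example by b.example');
$head->add('Subject',  'Stuff and things');

# One occurrence: always pass the INDEX, so context cannot surprise you.
my $subject = $head->get('Subject', 0);
chomp $subject;    # defensive: the value may end in a newline

# Every occurrence, in header order.
my @history = $head->get_all('Received');
chomp @history;

# A field that is not present gives the empty list.
my @missing = $head->get_all('X-Not-There');

print "Subject: $subject\n";
print "Hop:     $_\n" for @history;
print "Missing: ", scalar(@missing), " occurrence(s)\n";
```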
print [OUTSTREAM]
Instance method, override. Print the header out to the given
OUTSTREAM, or the currently-selected filehandle if none. The
OUTSTREAM may be a filehandle, or any object that responds to a
print() message.

The override actually lets you print to any object that responds to
a print() method. This is vital for outputting MIME entities to
scalars.

Also, it defaults to the currently-selected filehandle if none is
given (not STDOUT!), so please supply a filehandle to prevent
confusion.

stringify
Instance method. Return the header as a string. You can also
invoke it as "as_string".

unfold [FIELD]
Instance method, inherited. Unfold (remove newlines in) the text
of all occurrences of the given FIELD. If the FIELD is omitted, all
fields are unfolded. Returns the "self" object.

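The output methods and unfold() can be tried together as below. This is a
sketch assuming MIME-tools is installed; the folded subject line is
invented, and an in-memory filehandle stands in for the OUTSTREAM.

```perl
use strict;
use warnings;
use IO::Handle;
use MIME::Head;

# A header whose Subject is folded onto a continuation line.
my $text = "Subject: a rather long subject line\n"
         . " that was folded onto a second line\n"
         . "\n";
open(my $in, '<', \$text) or die "cannot open in-memory handle: $!";
my $head = MIME::Head->read($in) or die "header could not be read\n";

# Remove the internal newline from the Subject field only.
$head->unfold('Subject');
my $subject = $head->get('Subject', 0);
chomp $subject;
print "Unfolded: $subject\n";

# Always name the output handle; here it is a scalar-backed filehandle.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory handle: $!";
$head->print($out);
close $out or die "close failed: $!";

print "print() and stringify() agree\n" if $buffer eq $head->stringify;
```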
MIME-specific methods

All of the following methods extract information from the following
fields:

Content-type
Content-transfer-encoding
Content-disposition

Be aware that they do not just return the raw contents of those fields,
and in some cases they will fill in sensible (I hope) default values.
Use "get()" or "mime_attr()" if you need to grab and process the raw
field text.

Note: some of these methods are provided both as a convenience and for
backwards-compatibility only, while others (like
recommended_filename()) really do have to be in MIME::Head to work
properly, since they look for their value in more than one field.
However, if you know that a value is restricted to a single field, you
should really use the Mail::Field interface to get it.

mime_attr ATTR,[VALUE]
A quick-and-easy interface to set/get the attributes in structured
MIME fields:

$head->mime_attr("content-type" => "text/html");
$head->mime_attr("content-type.charset" => "US-ASCII");
$head->mime_attr("content-type.name" => "homepage.html");

This would cause the final output to look something like this:

Content-type: text/html; charset=US-ASCII; name="homepage.html"

Note that the special empty sub-field tag indicates the anonymous
first sub-field.

Giving VALUE as undefined will cause the contents of the named
sub-field to be deleted:

$head->mime_attr("content-type.charset" => undef);

Supplying no VALUE argument just returns the attribute's value, or
undefined if it isn't there:

$type = $head->mime_attr("content-type"); ### text/html
$name = $head->mime_attr("content-type.name"); ### homepage.html

In all cases, the new/current value is returned.

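Here is a sketch of the set/get/delete cycle described above, assuming
MIME-tools is installed. It uses only the attribute names from the
examples; the variable names are mine.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# Set the anonymous first sub-field, then a named one.
$head->mime_attr('content-type'         => 'text/html');
$head->mime_attr('content-type.charset' => 'US-ASCII');

# With no VALUE argument the current value is returned.
my $type    = $head->mime_attr('content-type');
my $charset = $head->mime_attr('content-type.charset');
print "type=$type charset=$charset\n";

# An undefined VALUE deletes the named sub-field.
$head->mime_attr('content-type.charset' => undef);
my $after = $head->mime_attr('content-type.charset');
print "charset after delete: ", (defined $after ? $after : '(undefined)'), "\n";

print $head->stringify;
```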
mime_encoding
Instance method. Try real hard to determine the content transfer
encoding (e.g., "base64", "binary"), which is returned in
all-lowercase.

If no encoding could be found, the default of "7bit" is returned.
I quote from RFC-1521 section 5:

This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
is assumed if the Content-Transfer-Encoding header field is not present.

I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are
corrected to "7bit"; likewise for "8bit".

mime_type [DEFAULT]
Instance method. Try "real hard" to determine the content type
(e.g., "text/plain", "image/gif", "x-weird-type"), which is returned
in all-lowercase. "Real hard" means that if no content type could
be found, the default (usually "text/plain") is returned. From
RFC-1521 section 7.1:

The default Content-Type for Internet mail is
"text/plain; charset=us-ascii".

The exception is a part of a "multipart/digest", for which
"message/rfc822" is the default. Note that you can also set the
default, but you shouldn't: normally only the MIME parser uses this
feature.

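The defaults and fixups described for mime_encoding and mime_type can be
observed with a sketch like this one, assuming MIME-tools is installed.
The upper-case field values are deliberately sloppy input chosen for the
illustration.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# With no Content-type or Content-transfer-encoding, defaults apply.
my $default_type = $head->mime_type;
my $default_enc  = $head->mime_encoding;
print "defaults: $default_type, $default_enc\n";

# Values are returned in lowercase, and "7-bit" is corrected to "7bit".
$head->replace('Content-type',              'IMAGE/GIF');
$head->replace('Content-transfer-encoding', '7-BIT');
my $type = $head->mime_type;
my $enc  = $head->mime_encoding;
print "explicit: $type, $enc\n";
```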
multipart_boundary
Instance method. If this is a header for a multipart message,
return the "encapsulation boundary" used to separate the parts.
The boundary is returned exactly as given in the "Content-type:"
field; that is, the leading double-hyphen ("--") is not prepended.

Well, almost exactly... this passage from RFC-1521 dictates that we
remove any trailing spaces:

If a boundary appears to end with white space, the white space
must be presumed to have been added by a gateway, and must be deleted.

Returns undef (not the empty string) if either the message is not
multipart or if there is no specified boundary.

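Because the leading "--" is not included, a caller has to build the
delimiter lines itself. A sketch, assuming MIME-tools is installed; the
boundary string "frontier" and the variable names are invented.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Content-type', 'multipart/mixed; boundary="frontier"');

my $boundary = $head->multipart_boundary;
die "not a multipart header\n" unless defined $boundary;

# In the message body, each part is introduced by "--" plus the boundary,
# and the last part is followed by the same line with a trailing "--".
my $delimiter = "--$boundary";
my $closing   = "--$boundary--";
print "boundary:  $boundary\n";
print "delimiter: $delimiter\n";
print "closing:   $closing\n";

# A header with no boundary gives undef, not the empty string.
my $plain = MIME::Head->new;
$plain->replace('Content-type', 'text/plain');
my $none = $plain->multipart_boundary;
print "text/plain boundary: ", (defined $none ? $none : '(undefined)'), "\n";
```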
recommended_filename
Instance method. Return the recommended external filename. This
is used when extracting the data from the MIME stream.

Returns undef if no filename could be suggested.

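The sketch below shows the filename being picked up from either of two
fields, and the undef result when neither suggests one. It assumes
MIME-tools is installed; the file names are invented, and the two cases
use separate headers so that nothing is implied about which field wins
when both are present.

```perl
use strict;
use warnings;
use MIME::Head;

# Filename given in Content-disposition.
my $by_disposition = MIME::Head->new;
$by_disposition->replace('Content-disposition',
                         'attachment; filename="report.pdf"');
my $from_disposition = $by_disposition->recommended_filename;

# Filename given as the "name" attribute of Content-type.
my $by_type = MIME::Head->new;
$by_type->replace('Content-type', 'text/html; name="homepage.html"');
my $from_type = $by_type->recommended_filename;

# No hint at all: the caller must invent a name.
my $bare = MIME::Head->new;
$bare->replace('Content-type', 'text/plain');
my $nothing = $bare->recommended_filename;

print "from disposition: $from_disposition\n";
print "from type:        $from_type\n";
print "no hint:          ", (defined $nothing ? $nothing : '(undefined)'), "\n";
```

Treat the returned name as untrusted input before using it as a path on
disk, since it comes straight from the message.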
Why have separate objects for the entity, head, and body?
See the documentation for the MIME-tools distribution for the
rationale behind this decision.

Why assume that MIME headers are email headers?
I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's
using the word "header" where I would use "field"; e.g., to refer
to "Subject:", "Content-type:", etc.):

There is also IMHO no requirement [for] MIME::Heads to look
like [email] headers; so to speak, the MIME::Head [simply stores]
the attributes of a complex object, e.g.:

new MIME::Head type => "text/plain",
               charset => ...,
               disposition => ..., ... ;

I agree in principle, but (alas and dammit) RFC-1521 says
otherwise. RFC-1521 [MIME] headers are a syntactic subset of RFC-822
[email] headers. Perhaps a better name for these modules would be
RFC1521:: instead of MIME::, but we're a little beyond that stage
now.

In my mind's eye, I see an abstract class, call it MIME::Attrs,
which does what Achim suggests... so you could say:

my $attrs = new MIME::Attrs type => "text/plain",
                            charset => ...,
                            disposition => ..., ... ;

We could even make it a superclass of MIME::Head: that way,
MIME::Head would have to implement its interface, and allow itself
to be initialized from a MIME::Attrs object.

However, when you read RFC-1521, you begin to see how much MIME
information is organized by its presence in particular fields. I
imagine that we'd begin to mirror the structure of RFC-1521 fields
and subfields to such a degree that this might not give us a
tremendous gain over just having MIME::Head.

Why all this "occurrence" and "index" jazz? Isn't every field unique?
Aaaaaaaaaahh....no.

Looking at a typical mail message header, it is sooooooo tempting
to just store the fields as a hash of strings, one string per hash
entry. Unfortunately, there's the little matter of the "Received:"
field, which (unlike "From:", "To:", etc.) will often have multiple
occurrences; e.g.:

Received: from gsfc.nasa.gov by eryq.pr.mcs.net with smtp
    (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
    Thu, 21 Dec 95 16:34 CST
Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
    (5.65/Ultrix3.0-C) id AA13596;
    Thu, 21 Dec 95 17:20:38 -0500
Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
    (8.6.12/8.6.12) id RAA28069;
    Thu, 21 Dec 1995 17:27:54 -0500
Date: Thu, 21 Dec 1995 17:27:54 -0500
From: Eryq <eryq@rhine.gsfc.nasa.gov>
Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
To: eryq@eryq.pr.mcs.net
Subject: Stuff and things

The "Received:" field is used for tracing message routes, and
although it's not generally used for anything other than human
debugging, I didn't want to inconvenience anyone who actually
wanted to get at that information.

I also didn't want to make this a special case; after all, who
knows what other fields could have multiple occurrences in the
future? So, clearly, multiple entries had to somehow be stored
multiple times... and the different occurrences had to be
retrievable.

Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

All rights reserved. This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

The more-comprehensive filename extraction is courtesy of Lee E.
Brotzman, Advanced Data Solutions.

$Revision: 1.14 $ $Date: 2006/03/17 21:03:23 $

perl v5.8.8   2006-03-17   MIME::Head(3)