MIME::Head(3)         User Contributed Perl Documentation        MIME::Head(3)

NAME
       MIME::Head - MIME message header (a subclass of Mail::Header)

SYNOPSIS
       Before reading further, you should see MIME::Tools to make sure that
       you understand where this module fits into the grand scheme of things.
       Go on, do it now.  I'll wait.

       Ready?  Ok...

   Construction
           ### Create a new, empty header, and populate it manually:
           $head = MIME::Head->new;
           $head->replace('content-type', 'text/plain; charset=US-ASCII');
           $head->replace('content-length', $len);

           ### Parse a new header from a filehandle:
           $head = MIME::Head->read(\*STDIN);

           ### Parse a new header from a file, or a readable pipe:
           $testhead = MIME::Head->from_file("/tmp/test.hdr");
           $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");

   Output
           ### Output to filehandle:
           $head->print(\*STDOUT);

           ### Output as string:
           print STDOUT $head->as_string;
           print STDOUT $head->stringify;

   Getting field contents
           ### Is this a reply?
           $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);

           ### Get receipt information:
           print "Last received from: ", $head->get('Received', 0);
           @all_received = $head->get('Received');

           ### Print the subject, or the empty string if none:
           print "Subject: ", $head->get('Subject',0);

           ### Too many hops?  Count 'em and see!
           if ($head->count('Received') > 5) { ...

           ### Test whether a given field exists
           warn "missing subject!" if (! $head->count('subject'));

   Setting field contents
           ### Declare this to be an HTML header:
           $head->replace('Content-type', 'text/html');

   Manipulating field contents
           ### Get rid of internal newlines in fields:
           $head->unfold;

           ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
           $head->decode;

   Getting high-level MIME information
           ### Get/set a given MIME attribute:
           unless ($charset = $head->mime_attr('content-type.charset')) {
               $head->mime_attr("content-type.charset" => "US-ASCII");
           }

           ### The content type (e.g., "text/html"):
           $mime_type     = $head->mime_type;

           ### The content transfer encoding (e.g., "quoted-printable"):
           $mime_encoding = $head->mime_encoding;

           ### The recommended name when extracted:
           $file_name     = $head->recommended_filename;

           ### The boundary text, for multipart messages:
           $boundary      = $head->multipart_boundary;

DESCRIPTION
       A class for parsing and manipulating RFC-822 message headers, with
       some methods geared towards standard (and not so standard) MIME fields
       as specified in the various Multipurpose Internet Mail Extensions RFCs
       (starting with RFC 2045).

PUBLIC INTERFACE
   Creation, input, and output
       new [ARG],[OPTIONS]
           Class method, inherited.  Creates a new header object.  Arguments
           are the same as those in the superclass.

       from_file EXPR,OPTIONS
           Class or instance method.  For convenience, you can use this to
           parse a header object in from EXPR, which may actually be any
           expression that can be sent to open() so as to return a readable
           filehandle.  The "file" will be opened, read, and then closed:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->from_file("/tmp/test.hdr");

           Since this method can function as either a class constructor or an
           instance initializer, the above is exactly equivalent to:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->new->from_file("/tmp/test.hdr");

           On success, the object will be returned; on failure, the undefined
           value.

           The OPTIONS are the same as in new(), and are passed into new() if
           this is invoked as a class method.

           Note: This is really just a convenience front-end onto "read()",
           provided mostly for backwards-compatibility with MIME-parser 1.0.

       read FILEHANDLE
           Instance (or class) method.  This initializes a header object by
           reading it in from a FILEHANDLE, until the terminating blank line
           is encountered.  A syntax error or end-of-stream will also halt
           processing.

           Supply this routine with a reference to a filehandle glob; e.g.,
           "\*STDIN":

               ### Create a new header by parsing in STDIN:
               $head->read(\*STDIN);

           On success, the self object will be returned; on failure, a false
           value.

           Note: in the MIME world, it is perfectly legal for a header to be
           empty, consisting of nothing but the terminating blank line.  Thus,
           we can't just use the formula that "no tags equals error".

           Warning: as of the time of this writing, Mail::Header::read did not
           flag either syntax errors or unexpected end-of-file conditions (an
           EOF before the terminating blank line).  MIME::ParserBase takes
           this into account.
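As a rough illustration of read(), the sketch below parses a header from an in-memory filehandle rather than STDIN. The header text is invented for the example, and the sketch assumes MIME-tools is installed.

```perl
use strict;
use warnings;
use MIME::Head;

# Made-up message text; the blank line terminates the header.
my $text = "Subject: Hello\nContent-type: text/plain\n\nThis is the body.\n";

# Open a read-only filehandle onto the string.
open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";

# Invoked as a class method, read() creates the header object for us.
my $head = MIME::Head->read($fh);
close($fh);

my $n_subject = $head->count('Subject');
my $type      = $head->mime_type;
print "Subject fields: $n_subject, type: $type\n";
```

Because an empty header is legal, a true return value from read() says nothing about how many fields were found; count() is the way to check for a particular field.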

   Getting/setting fields
       The following are methods related to retrieving and modifying the
       header fields.  Some are inherited from Mail::Header, but I've kept the
       documentation around for convenience.

       add TAG,TEXT,[INDEX]
           Instance method, inherited.  Add a new occurrence of the field
           named TAG, given by TEXT:

               ### Add the trace information:
               $head->add('Received',
                          'from eryq.pr.mcs.net by gonzo.net with smtp');

           Normally, the new occurrence will be appended to the existing
           occurrences.  However, if the optional INDEX argument is 0, then
           the new occurrence will be prepended.  If you want to be explicit
           about appending, specify an INDEX of -1.

           Warning: this method always adds new occurrences; it doesn't
           overwrite any existing occurrences... so if you just want to change
           the value of a field (creating it if necessary), then you probably
           don't want to use this method: consider using "replace()" instead.
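A small sketch of the append/prepend behaviour described for add(). The host names are placeholders invented for the example.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# No INDEX (same as -1): each call appends a new occurrence.
$head->add('Received', 'from a.example by b.example with smtp');
$head->add('Received', 'from b.example by c.example with smtp');

# INDEX 0: prepend, so this becomes the first occurrence.
$head->add('Received', 'from c.example by d.example with smtp', 0);

my $count = $head->count('Received');
my $first = $head->get('Received', 0);
chomp $first;
print "$count occurrences; first: $first\n";
```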

       count TAG
           Instance method, inherited.  Returns the number of occurrences of a
           field; in a boolean context, this tells you whether a given field
           exists:

               ### Was a "Subject:" field given?
               $subject_was_given = $head->count('subject');

           The TAG is treated in a case-insensitive manner.  This method
           returns some false value if the field doesn't exist, and some true
           value if it does.

       decode [FORCE]
           Instance method, DEPRECATED.  Go through all the header fields,
           looking for RFC 1522 / RFC 2047 style "Q" (quoted-printable, sort
           of) or "B" (base64) encoding, and decode them in-place.  Fellow
           Americans, you probably don't know what the hell I'm talking about.
           Europeans, Russians, et al, you probably do.  ":-)".

           This method has been deprecated.  See "decode_headers" in
           MIME::Parser for the full reasons.  If you absolutely must use it
           and don't like the warning, then provide a FORCE:

           "I_NEED_TO_FIX_THIS"
               Just shut up and do it.  Not recommended.
               Provided only for those who need to keep old scripts functioning.

           "I_KNOW_WHAT_I_AM_DOING"
               Just shut up and do it.  Not recommended.
               Provided for those who REALLY know what they are doing.

           What this method does: as an example, consider a valid email
           header you might get:

               From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
               To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
               CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
                =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
                =?US-ASCII?Q?.._cool!?=

           That basically decodes to (sorry, I can only approximate the Latin
           characters with 7 bit sequences /o and 'e):

               From: Keith Moore <moore@cs.utk.edu>
               To: Keld J/orn Simonsen <keld@dkuug.dk>
               CC: Andr'e  Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: If you can read this you understand the example... cool!

           Note: currently, the decodings are done without regard to the
           character set: thus, the Q-encoding "=F8" is simply translated to
           the octet (hexadecimal "F8"), period.  For piece-by-piece decoding
           of a given field, you want the array context of
           "MIME::Words::decode_mimewords()".

           Warning: the CRLF+SPACE separator that splits up long encoded words
           into shorter sequences (see the Subject: example above) gets lost
           when the field is unfolded, and so decoding after unfolding causes
           a spurious space to be left in the field.  THEREFORE: if you're
           going to decode, do so BEFORE unfolding!

           This method returns the self object.

           Thanks to Kent Boortz for providing the idea, and the baseline
           RFC-1522-decoding code.
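Since decode() is deprecated, here is a hedged sketch of the piece-by-piece alternative mentioned above, using decode_mimewords() from MIME::Words on the "To:" value from the example. The variable names are mine, and the output is raw octets, not text converted to any particular character set.

```perl
use strict;
use warnings;
use MIME::Words qw(decode_mimewords);

# The "To:" value from the example header.
my $raw = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>';

# Scalar context: the decoded pieces joined into one string of octets.
my $decoded = decode_mimewords($raw);

# Array context: one [TEXT, CHARSET] pair per piece.  CHARSET is
# undefined for pieces that were not encoded words.
my @pieces = decode_mimewords($raw);
for my $piece (@pieces) {
    my ($text, $charset) = @$piece;
    printf "%-12s %s\n", (defined $charset ? $charset : '(none)'), $text;
}
```

The array form keeps the charset next to each piece, which is what you need if you intend to convert the octets yourself.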

       delete TAG,[INDEX]
           Instance method, inherited.  Delete all occurrences of the field
           named TAG.  If the optional INDEX is given, only the occurrence at
           that index is removed (this is the behaviour of the underlying
           Mail::Header method).

               ### Remove some MIME information:
               $head->delete('MIME-Version');
               $head->delete('Content-type');

       get TAG,[INDEX]
           Instance method, inherited.  Get the contents of field TAG.

           If a numeric INDEX is given, returns the occurrence at that index,
           or undef if not present:

               ### Print the first and last 'Received:' entries (explicitly):
               print "First, or most recent: ", $head->get('received', 0);
               print "Last, or least recent: ", $head->get('received',-1);

           If no INDEX is given, but invoked in a scalar context, then INDEX
           simply defaults to 0:

               ### Get the first 'Received:' entry (implicitly):
               my $most_recent = $head->get('received');

           If no INDEX is given, and invoked in an array context, then all
           occurrences of the field are returned:

               ### Get all 'Received:' entries:
               my @all_received = $head->get('received');

           NOTE: The header(s) returned may end with a newline.  If you don't
           want this, then chomp the return value.
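A short sketch pulling the get() cases together: scalar context, an explicit INDEX, the trailing newline, and a missing field. Host names and the absent field name are placeholders.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->add('Received', 'from b.example by c.example with smtp');
$head->add('Received', 'from a.example by b.example with smtp');

# Scalar context, no INDEX: the first occurrence.
my $most_recent = $head->get('Received');

# Explicit INDEX of -1: the last occurrence.
my $least_recent = $head->get('Received', -1);

# The returned text may end with a newline, so chomp before use.
chomp($most_recent, $least_recent);

# A field that is not present yields undef.
my $missing = $head->get('X-No-Such-Field', 0);

print "most recent:  $most_recent\n";
print "least recent: $least_recent\n";
print "missing field is undefined\n" unless defined $missing;
```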

       get_all FIELD
           Instance method.  Returns the list of all occurrences of the field,
           or the empty list if the field is not present:

               ### How did it get here?
               @history = $head->get_all('Received');

           Note: I had originally experimented with having "get()" return all
           occurrences when invoked in an array context... but that causes a
           lot of accidents when you get careless and do stuff like this:

               print "\u$field: ", $head->get($field);

           It also made the intuitive behaviour unclear if the INDEX argument
           was given in an array context.  So I opted for an explicit approach
           to asking for all occurrences.
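A minimal sketch of get_all(), showing that the result is a plain list whose size you can test directly. The field contents are placeholders.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->add('Received', 'from b.example by c.example with smtp');
$head->add('Received', 'from a.example by b.example with smtp');

# Every occurrence, in header order.
my @history = $head->get_all('Received');

# A field that is not present gives the empty list, not (undef).
my @none = $head->get_all('X-No-Such-Field');

printf "%d hops, %d unknown fields\n", scalar(@history), scalar(@none);
```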

       print [OUTSTREAM]
           Instance method, override.  Print the header out to the given
           OUTSTREAM, or to the currently-selected filehandle (not
           necessarily STDOUT!) if none is given.  The OUTSTREAM may be a
           filehandle, or any object that responds to a print() message; the
           latter is vital for outputting MIME entities to scalars.

           To prevent confusion, please always supply a filehandle.

       stringify
           Instance method.  Return the header as a string.  You can also
           invoke it as "as_string".

           If you set the variable $MIME::Entity::BOUNDARY_DELIMITER to a
           string, that string will be used as the line-end delimiter.  If it
           is not set, the line ending will be a newline character (\n).
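A hedged sketch of stringify() and its as_string alias, including the effect of $MIME::Entity::BOUNDARY_DELIMITER as described above. Using local to scope the change is my choice for the example, not something the module requires.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Subject', 'Hello');
$head->replace('Content-type', 'text/plain');

# Both names return the same text.
my $text = $head->stringify;
my $same = $head->as_string;

# Temporarily ask for CRLF line endings.
my $crlf_text;
{
    no warnings 'once';
    local $MIME::Entity::BOUNDARY_DELIMITER = "\r\n";
    $crlf_text = $head->stringify;
}

print $text;
```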

       unfold [FIELD]
           Instance method, inherited.  Unfold (remove newlines in) the text
           of all occurrences of the given FIELD.  If the FIELD is omitted,
           all fields are unfolded.  Returns the "self" object.
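To show what unfolding does, this sketch builds a header from an array of lines (the ARG form accepted by the superclass constructor) in which the Subject is folded across two physical lines. The subject text is invented.

```perl
use strict;
use warnings;
use MIME::Head;

# A continuation line starts with whitespace.
my $head = MIME::Head->new([
    "Subject: a subject line that was folded\n",
    "    across two physical lines\n",
]);

# Unfold just this field; calling unfold() with no argument does all fields.
$head->unfold('Subject');

my $subject = $head->get('Subject');
chomp $subject;
print "Subject: $subject\n";
```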

   MIME-specific methods
       All of the following methods extract information from the following
       fields:

           Content-type
           Content-transfer-encoding
           Content-disposition

       Be aware that they do not just return the raw contents of those fields,
       and in some cases they will fill in sensible (I hope) default values.
       Use "get()" or "mime_attr()" if you need to grab and process the raw
       field text.

       Note: some of these methods are provided both as a convenience and for
       backwards-compatibility only, while others (like
       recommended_filename()) really do have to be in MIME::Head to work
       properly, since they look for their value in more than one field.
       However, if you know that a value is restricted to a single field, you
       should really use the Mail::Field interface to get it.

       mime_attr ATTR,[VALUE]
           A quick-and-easy interface to set/get the attributes in structured
           MIME fields:

               $head->mime_attr("content-type"         => "text/html");
               $head->mime_attr("content-type.charset" => "US-ASCII");
               $head->mime_attr("content-type.name"    => "homepage.html");

           This would cause the final output to look something like this:

               Content-type: text/html; charset=US-ASCII; name="homepage.html"

           Note that the special empty sub-field tag indicates the anonymous
           first sub-field.

           Giving VALUE as undefined will cause the contents of the named
           subfield to be deleted:

               $head->mime_attr("content-type.charset" => undef);

           Supplying no VALUE argument just returns the attribute's value, or
           undefined if it isn't there:

               $type = $head->mime_attr("content-type");      ### text/html
               $name = $head->mime_attr("content-type.name"); ### homepage.html

           In all cases, the new/current value is returned.
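A compact sketch of the get, set, and delete forms of mime_attr() on one field. The starting Content-type value is made up for the example.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Content-type',
               'text/html; charset=ISO-8859-1; name="homepage.html"');

# Get: no sub-field tag means the anonymous first sub-field.
my $type    = $head->mime_attr('content-type');
my $charset = $head->mime_attr('content-type.charset');
my $name    = $head->mime_attr('content-type.name');

# Set: overwrite one sub-field, leaving the others alone.
$head->mime_attr('content-type.charset' => 'US-ASCII');
my $new_charset = $head->mime_attr('content-type.charset');

# Delete: an explicit undef removes the sub-field.
$head->mime_attr('content-type.charset' => undef);
my $gone = $head->mime_attr('content-type.charset');

print "type=$type name=$name old=$charset new=$new_charset\n";
```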

       mime_encoding
           Instance method.  Try real hard to determine the content transfer
           encoding (e.g., "base64", "binary"), which is returned in all-
           lowercase.

           If no encoding could be found, the default of "7bit" is returned.
           I quote from RFC 2045 section 6.1:

               This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
               is assumed if the Content-Transfer-Encoding header field is not present.

           I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are
           corrected to "7bit"; likewise for "8bit".
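The three behaviours described for mime_encoding() (default, lowercasing, and the "7-bit" fixup) can be sketched as follows; the field values are chosen only to trigger each case.

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;

# No Content-transfer-encoding field at all: the RFC 2045 default.
my $default = $head->mime_encoding;

# Whatever the case in the field, the result is lowercase.
$head->replace('Content-transfer-encoding', 'BASE64');
my $lowered = $head->mime_encoding;

# Common misspellings of 7bit/8bit are corrected.
$head->replace('Content-transfer-encoding', '7-bit');
my $fixed = $head->mime_encoding;

print "$default $lowered $fixed\n";
```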

       mime_type [DEFAULT]
           Instance method.  Try "real hard" to determine the content type
           (e.g., "text/plain", "image/gif", "x-weird-type"), which is
           returned in all-lowercase.  "Real hard" means that if no content
           type could be found, the default (usually "text/plain") is
           returned.  From RFC 2045 section 5.2:

               Default RFC 822 messages without a MIME Content-Type header are
               taken by this protocol to be plain text in the US-ASCII character
               set, which can be explicitly specified as:

                  Content-type: text/plain; charset=us-ascii

               This default is assumed if no Content-Type header field is specified.

           The exception is a part of a "multipart/digest", for which
           "message/rfc822" is the default.  Note that you can also set the
           default, but you shouldn't: normally only the MIME parser uses this
           feature.
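A brief sketch of mime_type() with and without a Content-type field. Splitting the result on "/" is just one convenient way to use it, not something the method does for you.

```perl
use strict;
use warnings;
use MIME::Head;

# No Content-type field: the default is returned.
my $bare    = MIME::Head->new;
my $default = $bare->mime_type;

# The raw field text is mixed case; the result is lowercase,
# with the parameters stripped off.
my $head = MIME::Head->new;
$head->replace('Content-type', 'Image/GIF; name="logo.gif"');
my $type = $head->mime_type;

my ($major, $minor) = split m{/}, $type;
print "default=$default type=$type major=$major minor=$minor\n";
```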

       multipart_boundary
           Instance method.  If this is a header for a multipart message,
           return the "encapsulation boundary" used to separate the parts.
           The boundary is returned exactly as given in the "Content-type:"
           field; that is, the leading double-hyphen ("--") is not prepended.

           Well, almost exactly... this passage from RFC 2046 dictates that we
           remove any trailing spaces:

               If a boundary appears to end with white space, the white space
               must be presumed to have been added by a gateway, and must be deleted.

           Returns undef (not the empty string) if either the message is not
           multipart or if there is no specified boundary.
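Because the returned boundary lacks the leading "--", a caller has to add it to recognise delimiter lines. A minimal sketch, with an invented boundary string:

```perl
use strict;
use warnings;
use MIME::Head;

my $head = MIME::Head->new;
$head->replace('Content-type', 'multipart/mixed; boundary="frontier"');

my $boundary = $head->multipart_boundary;

# Per RFC 2046, each part starts at "--BOUNDARY" and the
# whole multipart body ends at "--BOUNDARY--".
my $delimiter = "--$boundary";
my $closer    = "--$boundary--";

# A non-multipart header has no boundary at all.
my $plain = MIME::Head->new;
$plain->replace('Content-type', 'text/plain');
my $none = $plain->multipart_boundary;

print "delimiter: $delimiter\ncloser:    $closer\n";
```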

       recommended_filename
           Instance method.  Return the recommended external filename.  This
           is used when extracting the data from the MIME stream.  The
           filename is always returned as a string in Perl's internal format
           (the UTF8 flag may be on!).

           Returns undef if no filename could be suggested.
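A hedged sketch of recommended_filename() looking in more than one field. It uses one header per case and deliberately does not rely on which field wins when both are present. The filenames are invented.

```perl
use strict;
use warnings;
use MIME::Head;

# Filename supplied via Content-disposition.
my $by_disp = MIME::Head->new;
$by_disp->replace('Content-disposition', 'attachment; filename="report.pdf"');
my $from_disposition = $by_disp->recommended_filename;

# Filename supplied via the "name" attribute of Content-type.
my $by_type = MIME::Head->new;
$by_type->replace('Content-type', 'application/pdf; name="summary.pdf"');
my $from_type = $by_type->recommended_filename;

# Nothing to go on: undef.
my $nameless = MIME::Head->new;
$nameless->replace('Content-type', 'text/plain');
my $no_name = $nameless->recommended_filename;

print "$from_disposition $from_type\n";
```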

NOTES
       Why have separate objects for the entity, head, and body?
           See the documentation for the MIME-tools distribution for the
           rationale behind this decision.

       Why assume that MIME headers are email headers?
           I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's
           using the word "header" where I would use "field"; e.g., to refer
           to "Subject:", "Content-type:", etc.):

               There is also IMHO no requirement [for] MIME::Heads to look
               like [email] headers; so to speak, the MIME::Head [simply stores]
               the attributes of a complex object, e.g.:

                   new MIME::Head type => "text/plain",
                                  charset => ...,
                                  disposition => ..., ... ;

           I agree in principle, but (alas and dammit) RFC 2045 says
           otherwise.  RFC 2045 [MIME] headers are a syntactic subset of
           RFC-822 [email] headers.

           In my mind's eye, I see an abstract class, call it MIME::Attrs,
           which does what Achim suggests... so you could say:

               my $attrs = new MIME::Attrs type => "text/plain",
                                           charset => ...,
                                           disposition => ..., ... ;

           We could even make it a superclass of MIME::Head: that way,
           MIME::Head would have to implement its interface, and allow itself
           to be initialized from a MIME::Attrs object.

           However, when you read RFC 2045, you begin to see how much MIME
           information is organized by its presence in particular fields.  I
           imagine that we'd begin to mirror the structure of RFC 2045 fields
           and subfields to such a degree that this might not give us a
           tremendous gain over just having MIME::Head.

       Why all this "occurrence" and "index" jazz?  Isn't every field unique?
           Aaaaaaaaaahh....no.

           Looking at a typical mail message header, it is sooooooo tempting
           to just store the fields as a hash of strings, one string per hash
           entry.  Unfortunately, there's the little matter of the "Received:"
           field, which (unlike "From:", "To:", etc.) will often have multiple
           occurrences; e.g.:

               Received: from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp
                   (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
                    Thu, 21 Dec 95 16:34 CST
               Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
                   (5.65/Ultrix3.0-C) id AA13596;
                    Thu, 21 Dec 95 17:20:38 -0500
               Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
                   (8.6.12/8.6.12) id RAA28069;
                    Thu, 21 Dec 1995 17:27:54 -0500
               Date: Thu, 21 Dec 1995 17:27:54 -0500
               From: Eryq <eryq@rhine.gsfc.nasa.gov>
               Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
               To: eryq@eryq.pr.mcs.net
               Subject: Stuff and things

           The "Received:" field is used for tracing message routes, and
           although it's not generally used for anything other than human
           debugging, I didn't want to inconvenience anyone who actually
           wanted to get at that information.

           I also didn't want to make this a special case; after all, who
           knows what other fields could have multiple occurrences in the
           future?  So, clearly, multiple entries had to somehow be stored
           multiple times... and the different occurrences had to be
           retrievable.
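To make the storage argument concrete, here is a rough sketch in plain Perl. It is not how MIME::Head or Mail::Header is implemented internally; it only contrasts a hash of strings, which silently keeps one "Received:" value, with a hash of arrays, which keeps every occurrence addressable by index. The variable names are hypothetical.

```perl
use strict;
use warnings;

# Unfolded sample fields, abbreviated from the header above.
my @lines = (
    "Received: from gsfc.nasa.gov by eryq.pr.mcs.net with smtp",
    "Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov",
    "Subject: Stuff and things",
);

my %flat;     # tempting: one string per field name
my %multi;    # occurrence-aware: a list of strings per field name

for my $line (@lines) {
    my ($tag, $text) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
    $flat{lc $tag} = $text;               # a later occurrence overwrites
    push @{ $multi{lc $tag} }, $text;     # every occurrence is kept
}

my $survivor    = $flat{received};
my $occurrences = scalar @{ $multi{received} };
my $first       = $multi{received}[0];

print "flat hash kept:  $survivor\n";
print "hash of arrays:  $occurrences occurrences, first is: $first\n";
```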

SEE ALSO
       Mail::Header, Mail::Field, MIME::Words, MIME::Tools

AUTHOR
       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
       Dianne Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

       All rights reserved.  This program is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.

       The more-comprehensive filename extraction is courtesy of Lee E.
       Brotzman, Advanced Data Solutions.

perl v5.34.0                      2022-01-21                     MIME::Head(3)