MIME::Head(3)         User Contributed Perl Documentation        MIME::Head(3)

NAME

6       MIME::Head - MIME message header (a subclass of Mail::Header)
7

SYNOPSIS

9       Before reading further, you should see MIME::Tools to make sure that
10       you understand where this module fits into the grand scheme of things.
11       Go on, do it now.  I'll wait.
12
13       Ready?  Ok...
14
15   Construction
16           ### Create a new, empty header, and populate it manually:
17           $head = MIME::Head->new;
18           $head->replace('content-type', 'text/plain; charset=US-ASCII');
19           $head->replace('content-length', $len);
20
21           ### Parse a new header from a filehandle:
22           $head = MIME::Head->read(\*STDIN);
23
24           ### Parse a new header from a file, or a readable pipe:
25           $testhead = MIME::Head->from_file("/tmp/test.hdr");
26           $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");
27
28   Output
29           ### Output to filehandle:
30           $head->print(\*STDOUT);
31
32           ### Output as string:
33           print STDOUT $head->as_string;
34           print STDOUT $head->stringify;
35
36   Getting field contents
37           ### Is this a reply?
38           $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);
39
40           ### Get receipt information:
41           print "Last received from: ", $head->get('Received', 0);
42           @all_received = $head->get('Received');
43
           ### Print the subject, or the empty string if none:
           print "Subject: ", $head->get('Subject', 0) // '';
46
47           ### Too many hops?  Count 'em and see!
48           if ($head->count('Received') > 5) { ...
49
50           ### Test whether a given field exists
51           warn "missing subject!" if (! $head->count('subject'));
52
53   Setting field contents
54           ### Declare this to be an HTML header:
55           $head->replace('Content-type', 'text/html');
56
57   Manipulating field contents
58           ### Get rid of internal newlines in fields:
59           $head->unfold;
60
61           ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
62           $head->decode;
63
64   Getting high-level MIME information
65           ### Get/set a given MIME attribute:
66           unless ($charset = $head->mime_attr('content-type.charset')) {
67               $head->mime_attr("content-type.charset" => "US-ASCII");
68           }
69
70           ### The content type (e.g., "text/html"):
71           $mime_type     = $head->mime_type;
72
73           ### The content transfer encoding (e.g., "quoted-printable"):
74           $mime_encoding = $head->mime_encoding;
75
76           ### The recommended name when extracted:
77           $file_name     = $head->recommended_filename;
78
79           ### The boundary text, for multipart messages:
80           $boundary      = $head->multipart_boundary;
81

DESCRIPTION

       A class for parsing in and manipulating RFC-822 message headers, with
       some methods geared towards standard (and not so standard) MIME fields
       as specified in the various Multipurpose Internet Mail Extensions RFCs
       (starting with RFC 2045).
87

PUBLIC INTERFACE

89   Creation, input, and output
90       new [ARG],[OPTIONS]
91           Class method, inherited.  Creates a new header object.  Arguments
92           are the same as those in the superclass.
93
94       from_file EXPR,OPTIONS
95           Class or instance method.  For convenience, you can use this to
96           parse a header object in from EXPR, which may actually be any
97           expression that can be sent to open() so as to return a readable
98           filehandle.  The "file" will be opened, read, and then closed:
99
100               ### Create a new header by parsing in a file:
101               my $head = MIME::Head->from_file("/tmp/test.hdr");
102
103           Since this method can function as either a class constructor or an
104           instance initializer, the above is exactly equivalent to:
105
106               ### Create a new header by parsing in a file:
107               my $head = MIME::Head->new->from_file("/tmp/test.hdr");
108
109           On success, the object will be returned; on failure, the undefined
110           value.
111
112           The OPTIONS are the same as in new(), and are passed into new() if
113           this is invoked as a class method.
114
115           Note: This is really just a convenience front-end onto "read()",
116           provided mostly for backwards-compatibility with MIME-parser 1.0.
117
118       read FILEHANDLE
119           Instance (or class) method.  This initializes a header object by
120           reading it in from a FILEHANDLE, until the terminating blank line
121           is encountered.  A syntax error or end-of-stream will also halt
122           processing.
123
124           Supply this routine with a reference to a filehandle glob; e.g.,
125           "\*STDIN":
126
127               ### Create a new header by parsing in STDIN:
128               $head->read(\*STDIN);
129
130           On success, the self object will be returned; on failure, a false
131           value.
132
133           Note: in the MIME world, it is perfectly legal for a header to be
134           empty, consisting of nothing but the terminating blank line.  Thus,
135           we can't just use the formula that "no tags equals error".
136
137           Warning: as of the time of this writing, Mail::Header::read did not
138           flag either syntax errors or unexpected end-of-file conditions (an
139           EOF before the terminating blank line).  MIME::ParserBase takes
140           this into account.
141
142   Getting/setting fields
143       The following are methods related to retrieving and modifying the
144       header fields.  Some are inherited from Mail::Header, but I've kept the
145       documentation around for convenience.
146
147       add TAG,TEXT,[INDEX]
148           Instance method, inherited.  Add a new occurrence of the field
149           named TAG, given by TEXT:
150
151               ### Add the trace information:
152               $head->add('Received',
153                          'from eryq.pr.mcs.net by gonzo.net with smtp');
154
155           Normally, the new occurrence will be appended to the existing
156           occurrences.  However, if the optional INDEX argument is 0, then
157           the new occurrence will be prepended.  If you want to be explicit
158           about appending, specify an INDEX of -1.
159
160           Warning: this method always adds new occurrences; it doesn't
161           overwrite any existing occurrences... so if you just want to change
162           the value of a field (creating it if necessary), then you probably
163           don't want to use this method: consider using "replace()" instead.
164
165       count TAG
166           Instance method, inherited.  Returns the number of occurrences of a
167           field; in a boolean context, this tells you whether a given field
168           exists:
169
170               ### Was a "Subject:" field given?
171               $subject_was_given = $head->count('subject');
172
173           The TAG is treated in a case-insensitive manner.  This method
174           returns some false value if the field doesn't exist, and some true
175           value if it does.
176
177       decode [FORCE]
178           Instance method, DEPRECATED.  Go through all the header fields,
179           looking for RFC 1522 / RFC 2047 style "Q" (quoted-printable, sort
180           of) or "B" (base64) encoding, and decode them in-place.  Fellow
181           Americans, you probably don't know what the hell I'm talking about.
182           Europeans, Russians, et al, you probably do.  ":-)".
183
184           This method has been deprecated.  See "decode_headers" in
185           MIME::Parser for the full reasons.  If you absolutely must use it
186           and don't like the warning, then provide a FORCE:
187
188              "I_NEED_TO_FIX_THIS"
189                     Just shut up and do it.  Not recommended.
190                     Provided only for those who need to keep old scripts functioning.
191
192              "I_KNOW_WHAT_I_AM_DOING"
193                     Just shut up and do it.  Not recommended.
194                     Provided for those who REALLY know what they are doing.
195
196           What this method does.  For an example, let's consider a valid
197           email header you might get:
198
199               From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
200               To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
201               CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
202               Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
203                =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
204                =?US-ASCII?Q?.._cool!?=
205
206           That basically decodes to (sorry, I can only approximate the Latin
207           characters with 7 bit sequences /o and 'e):
208
209               From: Keith Moore <moore@cs.utk.edu>
210               To: Keld J/orn Simonsen <keld@dkuug.dk>
211               CC: Andr'e  Pirard <PIRARD@vm1.ulg.ac.be>
212               Subject: If you can read this you understand the example... cool!
213
214           Note: currently, the decodings are done without regard to the
215           character set: thus, the Q-encoding "=F8" is simply translated to
216           the octet (hexadecimal "F8"), period.  For piece-by-piece decoding
217           of a given field, you want the array context of
218           "MIME::Words::decode_mimewords()".
219
220           Warning: the CRLF+SPACE separator that splits up long encoded words
221           into shorter sequences (see the Subject: example above) gets lost
222           when the field is unfolded, and so decoding after unfolding causes
223           a spurious space to be left in the field.  THEREFORE: if you're
224           going to decode, do so BEFORE unfolding!
225
226           This method returns the self object.
227
228           Thanks to Kent Boortz for providing the idea, and the baseline
229           RFC-1522-decoding code.
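
           The charset-blind decoding described above can be illustrated
           with a minimal sketch that uses only core modules.  It is not
           the code behind decode(): the helper names decode_word and
           decode_field are made up for this example, and only well-formed
           encoded-words are handled.  Whitespace between adjacent
           encoded-words is dropped, as RFC 2047 requires.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Pattern for one RFC 2047 encoded-word: =?charset?encoding?text?=
my $EW = qr/=\?[^?\s]+\?[QqBb]\?[^?\s]*\?=/;

# Decode the text of a single encoded-word to raw octets.
# The charset is accepted but deliberately ignored.
sub decode_word {
    my ($charset, $encoding, $text) = @_;
    return decode_base64($text) if uc($encoding) eq 'B';
    $text =~ tr/_/ /;                                # "_" stands for a space
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # "=F8" becomes octet 0xF8
    return $text;
}

# Decode every encoded-word in a raw (possibly folded) field body.
sub decode_field {
    my ($raw) = @_;
    # Drop whitespace (including fold breaks) between adjacent encoded-words.
    $raw =~ s/($EW)\s+(?=$EW)/$1/g;
    $raw =~ s/=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=/decode_word($1, $2, $3)/ge;
    return $raw;
}

# The folded Subject: field from the example above.
my $subject = join "\n ",
    '=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=',
    '=?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=',
    '=?US-ASCII?Q?.._cool!?=';
print decode_field($subject), "\n";
```

           Because the charset is ignored, the result is a string of
           octets; converting it to characters is a separate step.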
230
       delete TAG,[INDEX]
           Instance method, inherited.  Delete all occurrences of the field
           named TAG, or only the occurrence at INDEX if one is given.
234
235               ### Remove some MIME information:
236               $head->delete('MIME-Version');
237               $head->delete('Content-type');
238
239       get TAG,[INDEX]
240           Instance method, inherited.  Get the contents of field TAG.
241
242           If a numeric INDEX is given, returns the occurrence at that index,
243           or undef if not present:
244
245               ### Print the first and last 'Received:' entries (explicitly):
246               print "First, or most recent: ", $head->get('received', 0);
247               print "Last, or least recent: ", $head->get('received',-1);
248
249           If no INDEX is given, but invoked in a scalar context, then INDEX
250           simply defaults to 0:
251
252               ### Get the first 'Received:' entry (implicitly):
253               my $most_recent = $head->get('received');
254
255           If no INDEX is given, and invoked in an array context, then all
256           occurrences of the field are returned:
257
258               ### Get all 'Received:' entries:
259               my @all_received = $head->get('received');
260
261           NOTE: The header(s) returned may end with a newline.  If you don't
262           want this, then chomp the return value.
263
264       get_all FIELD
265           Instance method.  Returns the list of all occurrences of the field,
266           or the empty list if the field is not present:
267
268               ### How did it get here?
269               @history = $head->get_all('Received');
270
271           Note: I had originally experimented with having "get()" return all
272           occurrences when invoked in an array context... but that causes a
273           lot of accidents when you get careless and do stuff like this:
274
275               print "\u$field: ", $head->get($field);
276
277           It also made the intuitive behaviour unclear if the INDEX argument
278           was given in an array context.  So I opted for an explicit approach
279           to asking for all occurrences.
280
281       print [OUTSTREAM]
           Instance method, override.  Print the header out to the given
           OUTSTREAM, which may be a filehandle or any object that responds
           to a print() message.  Accepting any print()-capable object is
           what the override adds, and it is vital for outputting MIME
           entities to scalars.

           If no OUTSTREAM is given, output goes to the currently-selected
           filehandle (which is not necessarily STDOUT!), so please supply a
           filehandle to prevent confusion.
294
295       stringify
296           Instance method.  Return the header as a string.  You can also
297           invoke it as "as_string".
298
           If you set the variable $MIME::Entity::BOUNDARY_DELIMITER to a
           string, that string will be used as the line-end delimiter.  If it
           is not set, the line ending will be a newline character (\n).
302
303       unfold [FIELD]
304           Instance method, inherited.  Unfold (remove newlines in) the text
305           of all occurrences of the given FIELD.  If the FIELD is omitted,
306           all fields are unfolded.  Returns the "self" object.
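
           As a rough illustration of what unfolding means, the sketch
           below replaces each fold (a line break followed by whitespace)
           with a single space.  The function name unfold_text is
           hypothetical, and the inherited unfold() may treat the
           whitespace differently.

```perl
use strict;
use warnings;

# Replace each fold (a line break followed by spaces or tabs) with one space.
sub unfold_text {
    my ($text) = @_;
    $text =~ s/\r?\n[ \t]+/ /g;
    return $text;
}

my $folded = "from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp\n"
           . "    (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;\n"
           . "     Thu, 21 Dec 95 16:34 CST";
print unfold_text($folded), "\n";
```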
307
308   MIME-specific methods
309       All of the following methods extract information from the following
310       fields:
311
312           Content-type
313           Content-transfer-encoding
314           Content-disposition
315
316       Be aware that they do not just return the raw contents of those fields,
317       and in some cases they will fill in sensible (I hope) default values.
318       Use "get()" or "mime_attr()" if you need to grab and process the raw
319       field text.
320
321       Note: some of these methods are provided both as a convenience and for
322       backwards-compatibility only, while others (like
323       recommended_filename()) really do have to be in MIME::Head to work
324       properly, since they look for their value in more than one field.
325       However, if you know that a value is restricted to a single field, you
326       should really use the Mail::Field interface to get it.
327
328       mime_attr ATTR,[VALUE]
329           A quick-and-easy interface to set/get the attributes in structured
330           MIME fields:
331
332               $head->mime_attr("content-type"         => "text/html");
333               $head->mime_attr("content-type.charset" => "US-ASCII");
334               $head->mime_attr("content-type.name"    => "homepage.html");
335
336           This would cause the final output to look something like this:
337
338               Content-type: text/html; charset=US-ASCII; name="homepage.html"
339
340           Note that the special empty sub-field tag indicates the anonymous
341           first sub-field.
342
343           Giving VALUE as undefined will cause the contents of the named
344           subfield to be deleted:
345
346               $head->mime_attr("content-type.charset" => undef);
347
348           Supplying no VALUE argument just returns the attribute's value, or
349           undefined if it isn't there:
350
351               $type = $head->mime_attr("content-type");      ### text/html
352               $name = $head->mime_attr("content-type.name"); ### homepage.html
353
354           In all cases, the new/current value is returned.
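
           The "field.subfield" addressing can be pictured with the
           following sketch, which parses a structured field into a hash
           and stores the anonymous first sub-field under the empty key.
           The names parse_params and attr_lookup are invented for this
           example, and the naive split on ";" assumes that no quoted
           value contains a semicolon; this is not how mime_attr() itself
           is implemented.

```perl
use strict;
use warnings;

# Parse 'text/html; charset=US-ASCII; name="x"' into a hash reference.
# The anonymous first sub-field is stored under the empty key ''.
sub parse_params {
    my ($value) = @_;
    my ($first, @rest) = split /\s*;\s*/, $value;
    my %attr = ('' => $first);
    for my $pair (@rest) {
        my ($name, $val) = $pair =~ /^([^=\s]+)\s*=\s*(.*)$/ or next;
        $val =~ s/^"(.*)"$/$1/;        # strip surrounding double quotes
        $attr{lc $name} = $val;
    }
    return \%attr;
}

# Look up "field" or "field.subfield" (case-insensitively) in a hash of
# parsed fields; returns undef if the field or sub-field is absent.
sub attr_lookup {
    my ($fields, $path) = @_;
    my ($tag, $sub) = split /\./, lc($path), 2;
    $sub = '' unless defined $sub;
    my $attr = $fields->{$tag} or return;
    return $attr->{$sub};
}

my $head_fields = {
    'content-type' =>
        parse_params('text/html; charset=US-ASCII; name="homepage.html"'),
};
print attr_lookup($head_fields, 'content-type'), "\n";
print attr_lookup($head_fields, 'content-type.name'), "\n";
```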
355
356       mime_encoding
357           Instance method.  Try real hard to determine the content transfer
358           encoding (e.g., "base64", "binary"), which is returned in all-
359           lowercase.
360
           If no encoding could be found, the default of "7bit" is returned.  I
           quote from RFC 2045 section 6.1:
363
364               This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
365               is assumed if the Content-Transfer-Encoding header field is not present.
366
367           I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are
368           corrected to "7bit"; likewise for "8bit".
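
           The default and the fixups described above amount to something
           like the following sketch.  The function name normalize_encoding
           is hypothetical, and the real method may apply further rules.

```perl
use strict;
use warnings;

# Normalize a raw Content-transfer-encoding value: lowercase it, trim it,
# repair "7_bit"/"7-bit"/"7 bit" (and the 8bit variants), default to "7bit".
sub normalize_encoding {
    my ($raw) = @_;
    return '7bit' unless defined $raw && $raw =~ /\S/;
    my $enc = lc $raw;
    $enc =~ s/^\s+|\s+$//g;
    $enc =~ s/^([78])[ _-]bit$/${1}bit/;
    return $enc;
}

print normalize_encoding($_), "\n" for '7-BIT', '8 bit', 'Quoted-Printable';
```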
369
370       mime_type [DEFAULT]
           Instance method.  Try "real hard" to determine the content type
           (e.g., "text/plain", "image/gif", "x-weird-type"), which is returned
           in all-lowercase.  "Real hard" means that if no content type could
           be found, the default (usually "text/plain") is returned.  From RFC
           2045 section 5.2:
376
377              Default RFC 822 messages without a MIME Content-Type header are
378              taken by this protocol to be plain text in the US-ASCII character
379              set, which can be explicitly specified as:
380
381                 Content-type: text/plain; charset=us-ascii
382
383              This default is assumed if no Content-Type header field is specified.
384
           The one exception is a part of a "multipart/digest", for which
           the default is "message/rfc822".  Note that you can also set the
           default, but you shouldn't: normally only the MIME parser uses this
           feature.
389
390       multipart_boundary
391           Instance method.  If this is a header for a multipart message,
392           return the "encapsulation boundary" used to separate the parts.
393           The boundary is returned exactly as given in the "Content-type:"
394           field; that is, the leading double-hyphen ("--") is not prepended.
395
396           Well, almost exactly... this passage from RFC 2046 dictates that we
397           remove any trailing spaces:
398
399              If a boundary appears to end with white space, the white space
400              must be presumed to have been added by a gateway, and must be deleted.
401
402           Returns undef (not the empty string) if either the message is not
403           multipart or if there is no specified boundary.
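
           A minimal sketch of these rules, working directly on the raw
           text of a "Content-type:" field, might look as follows.  The
           function name boundary_of and its simple regular expression are
           assumptions made for illustration; the method itself goes
           through the module's own field parsing.

```perl
use strict;
use warnings;

# Return the boundary parameter of a multipart Content-type value with
# trailing white space removed, or undef if the type is not multipart
# or no boundary is specified.
sub boundary_of {
    my ($content_type) = @_;
    return unless defined $content_type
        && $content_type =~ m{^\s*multipart/}i;
    my ($quoted, $bare) =
        $content_type =~ /;\s*boundary\s*=\s*(?:"([^"]*)"|([^\s;]+))/i
        or return;
    my $boundary = defined $quoted ? $quoted : $bare;
    $boundary =~ s/\s+$//;      # RFC 2046: trailing white space is deleted
    return length $boundary ? $boundary : undef;
}

print boundary_of('multipart/mixed; boundary="simple boundary  "'), "\n";
```

           Note that the value comes back without the leading "--", just
           as it does from multipart_boundary().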
404
405       recommended_filename
           Instance method.  Return the recommended external filename.  This
           is used when extracting the data from the MIME stream.  The
           filename is always returned as a string in Perl's internal format
           (the UTF8 flag may be on!).
410
411           Returns undef if no filename could be suggested.
412

NOTES

414       Why have separate objects for the entity, head, and body?
415           See the documentation for the MIME-tools distribution for the
416           rationale behind this decision.
417
418       Why assume that MIME headers are email headers?
419           I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's
420           using the word "header" where I would use "field"; e.g., to refer
421           to "Subject:", "Content-type:", etc.):
422
423               There is also IMHO no requirement [for] MIME::Heads to look
424               like [email] headers; so to speak, the MIME::Head [simply stores]
425               the attributes of a complex object, e.g.:
426
427                   new MIME::Head type => "text/plain",
428                                  charset => ...,
429                                  disposition => ..., ... ;
430
431           I agree in principle, but (alas and dammit) RFC 2045 says
432           otherwise.  RFC 2045 [MIME] headers are a syntactic subset of
433           RFC-822 [email] headers.
434
435           In my mind's eye, I see an abstract class, call it MIME::Attrs,
436           which does what Achim suggests... so you could say:
437
438                my $attrs = new MIME::Attrs type => "text/plain",
439                                            charset => ...,
440                                            disposition => ..., ... ;
441
442           We could even make it a superclass of MIME::Head: that way,
443           MIME::Head would have to implement its interface, and allow itself
444           to be initialized from a MIME::Attrs object.
445
446           However, when you read RFC 2045, you begin to see how much MIME
447           information is organized by its presence in particular fields.  I
448           imagine that we'd begin to mirror the structure of RFC 2045 fields
449           and subfields to such a degree that this might not give us a
450           tremendous gain over just having MIME::Head.
451
452       Why all this "occurrence" and "index" jazz?  Isn't every field unique?
453           Aaaaaaaaaahh....no.
454
455           Looking at a typical mail message header, it is sooooooo tempting
456           to just store the fields as a hash of strings, one string per hash
457           entry.  Unfortunately, there's the little matter of the "Received:"
458           field, which (unlike "From:", "To:", etc.) will often have multiple
459           occurrences; e.g.:
460
461               Received: from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp
462                   (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
463                    Thu, 21 Dec 95 16:34 CST
464               Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
465                    (5.65/Ultrix3.0-C) id AA13596;
466                    Thu, 21 Dec 95 17:20:38 -0500
467               Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
468                    (8.6.12/8.6.12) id RAA28069;
469                    Thu, 21 Dec 1995 17:27:54 -0500
470               Date: Thu, 21 Dec 1995 17:27:54 -0500
471               From: Eryq <eryq@rhine.gsfc.nasa.gov>
472               Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
473               To: eryq@eryq.pr.mcs.net
474               Subject: Stuff and things
475
476           The "Received:" field is used for tracing message routes, and
477           although it's not generally used for anything other than human
478           debugging, I didn't want to inconvenience anyone who actually
479           wanted to get at that information.
480
481           I also didn't want to make this a special case; after all, who
482           knows what other fields could have multiple occurrences in the
483           future?  So, clearly, multiple entries had to somehow be stored
484           multiple times... and the different occurrences had to be
485           retrievable.
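
           The storage idea can be sketched as a hash of arrays keyed by
           the lower-cased tag.  This is only an illustration of the
           occurrence/index model: the function names add_field,
           get_field, and count_field are made up, and the sketch says
           nothing about how Mail::Header actually stores its fields.

```perl
use strict;
use warnings;

my %fields;    # lower-cased tag => array ref of occurrences, in order

# Add an occurrence: prepend if INDEX is 0, otherwise append.
sub add_field {
    my ($tag, $text, $index) = @_;
    my $list = $fields{lc $tag} ||= [];
    if (defined $index && $index == 0) { unshift @$list, $text }
    else                               { push    @$list, $text }
    return;
}

# Get the occurrence at INDEX (default 0), or undef if not present.
sub get_field {
    my ($tag, $index) = @_;
    my $list = $fields{lc $tag} || [];
    return $list->[ defined $index ? $index : 0 ];
}

# Count the occurrences of a tag; 0 means the field does not exist.
sub count_field {
    my ($tag) = @_;
    return scalar @{ $fields{lc $tag} || [] };
}

add_field('Received', 'from rhine.gsfc.nasa.gov by gsfc.nasa.gov');
add_field('Received', 'from gsfc.nasa.gov by eryq.pr.mcs.net', 0);
add_field('Subject',  'Stuff and things');

print count_field('received'), " hops\n";
print "Most recent: ", get_field('Received', 0), "\n";
```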
486

SEE ALSO

488       Mail::Header, Mail::Field, MIME::Words, MIME::Tools
489

AUTHOR

491       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).
492       Dianne Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com
493
494       All rights reserved.  This program is free software; you can
495       redistribute it and/or modify it under the same terms as Perl itself.
496
497       The more-comprehensive filename extraction is courtesy of Lee E.
498       Brotzman, Advanced Data Solutions.

perl v5.32.0                      2020-07-28                     MIME::Head(3)