Mail::Mbox::MessageParser(3)  User Contributed Perl Documentation  Mail::Mbox::MessageParser(3)

NAME

       Mail::Mbox::MessageParser - A fast and simple mbox folder reader

SYNOPSIS

         #!/usr/bin/perl

         use strict;
         use warnings;

         # FileHandle must be loaded before FileHandle->new can be called
         use FileHandle;
         use Mail::Mbox::MessageParser;

         my $file_name = 'mail/saved-mail';
         my $file_handle = FileHandle->new($file_name);

         # Set up cache. (Not necessary if enable_cache is false.)
         Mail::Mbox::MessageParser::SETUP_CACHE(
           { 'file_name' => '/tmp/cache' } );

         my $folder_reader =
           Mail::Mbox::MessageParser->new( {
             'file_name' => $file_name,
             'file_handle' => $file_handle,
             'enable_cache' => 1,
             'enable_grep' => 1,
           } );

         # On failure, new() returns an error string instead of an object
         die $folder_reader unless ref $folder_reader;

         # Any newlines or such before the start of the first email
         my $prologue = $folder_reader->prologue;
         print $prologue;

         # This is the main loop. It's executed once for each email
         while(!$folder_reader->end_of_file())
         {
           my $email = $folder_reader->read_next_email();
           print $$email;
         }

DESCRIPTION

       This module implements a fast but simple mbox folder reader. One of
       three implementations (Cache, Grep, Perl) will be used depending on the
       wishes of the user and the system configuration. The first
       implementation is a cache-based one which stores email information
       about mailboxes on the file system. Subsequent accesses will be faster
       because no analysis of the mailbox will be needed. The second
       implementation is one based on GNU grep, and is significantly faster
       than the Perl version for mailboxes which contain very large (10MB)
       emails. The final implementation is a fast Perl-based one which should
       always be applicable.

       The Cache implementation is about 6 times faster than the standard Perl
       implementation. The Grep implementation is about 4 times faster than
       the standard Perl implementation. If you have GNU grep, it's best to
       enable both the Cache and Grep implementations. If the cache
       information is available, you'll get very fast speeds. Otherwise, you'll
       take about a 1/3 performance hit when the Grep version is used instead.

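       The 1/3 figure follows from the two speedups quoted above. A minimal
       worked example, assuming the approximate 6x and 4x figures (the
       variable names are illustrative only):

```perl
use strict;
use warnings;

# Approximate speedups over the pure-Perl implementation, as quoted above.
my $cache_speedup = 6;
my $grep_speedup  = 4;

# Throughput of Grep relative to Cache, and the resulting slowdown.
my $relative_throughput = $grep_speedup / $cache_speedup;
my $performance_hit     = 1 - $relative_throughput;

printf "Grep runs at about %.0f%% of Cache throughput (a %.0f%% hit)\n",
  100 * $relative_throughput, 100 * $performance_hit;
```
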
       The overriding requirement for this module is speed. If you want more
       sophisticated parsing, use Mail::MboxParser (which is based on this
       module) or Mail::Box.

   METHODS AND FUNCTIONS

       SETUP_CACHE(...)
             SETUP_CACHE( { 'file_name' => <cache file name> } );

             <cache file name> - the file name of the cache

           Call this function once to set up the cache before creating any
           parsers. You must provide the location of the cache file. There is
           no default value.

       new(...)
             new( { 'file_name' => <mailbox file name>,
               'file_handle' => <mailbox file handle>,
               'enable_cache' => <1 or 0>,
               'enable_grep' => <1 or 0>,
               'force_processing' => <1 or 0>,
               'debug' => <1 or 0>,
             } );

             <mailbox file name> - the file name of the mailbox
             <mailbox file handle> - the already opened file handle for the mailbox
             <enable_cache> - true to attempt to use the cache implementation
             <enable_grep> - true to attempt to use the grep implementation
             <force_processing> - true to force processing of files that look invalid
             <debug> - true to print some debugging information to STDERR

           The constructor takes either a file name or a file handle, or both.
           If the file handle is not defined, Mail::Mbox::MessageParser will
           attempt to open the file using the file name. You should always
           pass the file name if you have it, so that the parser can cache the
           mailbox information.

           This module will automatically decompress the mailbox as necessary.
           If a filename is available but the file handle is undef, the module
           will call either tzip, bzip2, or gzip to decompress the file in
           memory if the filename ends with .tz, .bz2, or .gz, respectively.
           If the file handle is defined, it will detect the type of
           compression and apply the correct decompression program.

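           The filename-based choice described above can be pictured as a
           lookup from extension to program. The following is a minimal
           sketch only, not the module's actual code: decompressor_for() and
           %DECOMPRESSOR_FOR are hypothetical names, and pairing .tz with
           tzip is an assumption drawn from the extension list.

```perl
use strict;
use warnings;

# Hypothetical mapping from file name extension to decompression program.
# The .tz => tzip pairing is an assumption, not confirmed by the text above.
my %DECOMPRESSOR_FOR = (
    tz  => 'tzip',
    bz2 => 'bzip2',
    gz  => 'gzip',
);

# Return the program name for a file name, or undef if the name carries no
# recognized compression extension.
sub decompressor_for {
    my ($file_name) = @_;
    return unless defined $file_name;

    # Capture the text after the last dot in the final path component.
    my ($extension) = $file_name =~ /\.([^.\/]+)\z/;
    return unless defined $extension;

    return $DECOMPRESSOR_FOR{ lc $extension };
}

for my $name ('mail/saved-mail', 'mail/archive.gz', 'mail/archive.bz2') {
    my $program = decompressor_for($name);
    printf "%-18s %s\n", $name, defined $program ? $program : '(read as-is)';
}
```

           When only a file handle is supplied there is no extension to
           inspect, which is why the module has to detect the compression
           type from the data instead.
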
           The Cache, Grep, or Perl implementation of the parser will be
           loaded, whichever is most appropriate. For example, the first time
           you use caching, there will be no cache. In this case, the grep
           implementation can be used instead. The cache will be updated in
           memory as the grep implementation parses the mailbox, and the cache
           will be written after the program exits. The file name is optional,
           but if it is omitted, enable_cache and enable_grep must both be
           false.

           force_processing will cause the module to process folders that
           appear to be binary, or whose text data doesn't look like a mailbox.

           Returns a reference to a Mail::Mbox::MessageParser object on
           success, and a scalar describing an error on failure. ("Not a
           mailbox", "Can't open <filename>: <system error>", "Can't execute
           <uncompress command> for file <filename>").

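           Because failure is reported as a plain string rather than an
           exception, callers need to test the result with ref. A minimal
           sketch of one way to do that follows. check_parser() is a
           hypothetical helper, and comparing against the exact text "Not a
           mailbox" assumes the error string is returned exactly as quoted
           above.

```perl
use strict;
use warnings;

# Hypothetical helper: take the value returned by new() and either return
# the parser object, skip the file, or die with the reported error.
sub check_parser {
    my ($result, $file_name) = @_;

    # Success: new() returned an object reference.
    return $result if ref $result;

    # Assumed exact error text; treat non-mailbox files as skippable.
    if ($result eq 'Not a mailbox') {
        warn "Skipping $file_name: $result\n";
        return;
    }

    # Anything else (open or decompression failure) is treated as fatal.
    die "Cannot read $file_name: $result\n";
}
```

           A caller would pass the return value of new() and the file name to
           check_parser() and move on to the next file when it returns undef.
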
       reset()
           Reset the filehandle and all internal state. Note that this will
           not work with filehandles which are streams. If there is enough
           demand, I may add the ability to store the previously read stream
           data internally so that reset() will work correctly.

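           One use for reset() is a two-pass read, for example counting the
           emails first and then processing them with the total known. A
           minimal sketch, assuming $folder_reader is a parser created with
           new() on a regular file (not a stream); process_with_total() and
           its callback signature are hypothetical:

```perl
use strict;
use warnings;

# Count the emails in a first pass, rewind with reset(), then hand each
# email to $callback together with its number and the total count.
sub process_with_total {
    my ($folder_reader, $callback) = @_;

    # First pass: count the emails.
    my $total = 0;
    while (!$folder_reader->end_of_file()) {
        my $email = $folder_reader->read_next_email();
        last unless defined $email;
        $total++;
    }

    # Rewind to the start of the mailbox.
    $folder_reader->reset();

    # Second pass: process each email.
    while (!$folder_reader->end_of_file()) {
        my $email = $folder_reader->read_next_email();
        last unless defined $email;
        $callback->($folder_reader->number(), $total, $email);
    }

    return $total;
}
```
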
       endline()
           Returns "\n" or "\r\n", depending on the file format.

       prologue()
           Returns any newlines or other content at the start of the mailbox
           prior to the first email.

       end_of_file()
           Returns true if the end of the file has been encountered.

       line_number()
           Returns the line number for the start of the last email read.

       number()
           Returns the number of the last email read. (That is, the first
           email has number 1.)

       length()
           Returns the length of the last email read.

       offset()
           Returns the byte offset of the last email read.

       read_next_email()
           Returns a reference to a scalar holding the text of the next email
           in the mailbox, or undef at the end of the file.

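           The position accessors all describe the email most recently
           returned by read_next_email(), so they are read inside the loop,
           after each call. A minimal sketch that collects them into an index,
           assuming $folder_reader is a parser object created with new();
           build_index() and the hash keys are hypothetical:

```perl
use strict;
use warnings;

# Read every email and record where it sits in the mailbox. Returns a
# reference to an array with one hash per email.
sub build_index {
    my ($folder_reader) = @_;

    my @index;
    while (!$folder_reader->end_of_file()) {
        my $email = $folder_reader->read_next_email();
        last unless defined $email;

        # Each accessor refers to the email that was just read.
        push @index, {
            number => $folder_reader->number(),
            line   => $folder_reader->line_number(),
            offset => $folder_reader->offset(),
            length => $folder_reader->length(),
        };
    }

    return \@index;
}
```

           Each entry could then be printed or stored, for example to report
           which email starts at a given line of the mailbox file.
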

BUGS

       No known bugs.

       Contact david@coppit.org for bug reports and suggestions.

AUTHOR

       David Coppit <david@coppit.org>.

LICENSE

       This software is distributed under the terms of the GPL. See the file
       "LICENSE" for more information.

HISTORY

       This code was originally part of the grepmail distribution. See
       http://grepmail.sf.net/ for previous versions of grepmail which
       included early versions of this code.

SEE ALSO

       Mail::MboxParser, Mail::Box

perl v5.8.8                       2007-01-11      Mail::Mbox::MessageParser(3)