GREPMAIL(1)           User Contributed Perl Documentation          GREPMAIL(1)

NAME

       grepmail - search mailboxes for mail matching a regular expression

SYNOPSIS

         grepmail [--help|--version] [-abBDFhHilLmrRuvVw] [-C <cache-file>]
           [-j <status>] [-s <sizespec>] [-d <date-specification>]
           [-X <signature-pattern>] [-Y <header-pattern>]
           [[-e] <pattern>|-E <expr>|-f <pattern-file>] <files...>

DESCRIPTION

         grepmail looks for mail messages containing a pattern, and prints the
         resulting messages on standard output.

         By default grepmail looks in both header and body for the specified
         pattern.

         When redirected to a file, the result is another mailbox, which can,
         in turn, be handled by standard User Agents, such as elm, or even
         used as input for another instance of grepmail.

         At least one of -E, -e, -d, -s, or -u must be specified. The pattern
         is optional if -d, -s, and/or -u is used. The -e flag is optional if
         there is no file whose name is the pattern. The -E option can be used
         to specify complex search expressions involving logical operators.
         (See below.)

         If a mailbox cannot be found, grepmail first searches the directory
         specified by the MAILDIR environment variable (if one is defined),
         then searches the $HOME/mail, $HOME/Mail, and $HOME/Mailbox
         directories.

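         The fallback order for locating a mailbox can be sketched in Python.
         This is only an illustration of the documented order, not grepmail's
         own code (grepmail is written in Perl). The function name
         candidate_paths is hypothetical, and trying the name exactly as
         given first is an assumption.

```python
import posixpath


def candidate_paths(name, env):
    """Return the paths to try for mailbox `name`, in the documented order."""
    paths = [name]  # assumption: the name as given is tried first
    if env.get("MAILDIR"):
        paths.append(posixpath.join(env["MAILDIR"], name))
    home = env.get("HOME")
    if home:
        # Documented fallback directories under $HOME, in order.
        for sub in ("mail", "Mail", "Mailbox"):
            paths.append(posixpath.join(home, sub, name))
    return paths


for path in candidate_paths("inbox", {"MAILDIR": "/var/mail", "HOME": "/home/me"}):
    print(path)
```
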

OPTIONS AND ARGUMENTS

       Many of the options and arguments are analogous to those of grep.

       pattern
         The pattern to search for in the mail message. May be any Perl
         regular expression, but should be quoted on the command line to
         protect against globbing (shell expansion). To search for more than
         one pattern, use the form "(pattern1|pattern2|...)".

         Note that complex pattern features such as "(?>...)" require that you
         use a version of perl which supports them. You can use the pattern
         "()" to indicate that you do not want to match anything. This is
         useful if you want to initialize the cache without printing any
         output.

       mailbox
         Mailboxes must be in the traditional UNIX "/bin/mail" mailbox format.
         They may be compressed with gzip or bzip2, in which case gunzip or
         bzip2 must be installed on the system.

         If no mailbox is specified, grepmail takes input from stdin, which
         can be compressed or not. grepmail's behavior is undefined when ASCII
         and binary data are piped together as input.

       -a
         Use arrival date instead of sent date.

       -b
         Asserts that the pattern must match in the body of the email.

       -B
         Print the body but with only minimal ('From ', 'From:', 'Subject:',
         'Date:') headers. This flag can be used with -H, in which case it
         will print only short headers and no email bodies.

       -C
         Specifies the location of the cache file. The default is
         $HOME/.grepmail-cache.

       -D
         Enable debug mode, which prints diagnostic messages.

       -d
         Date specifications must take one of the following forms:
           - a date like "today", "yesterday", "5/18/93", "5 days ago", or
             "5 weeks ago",
           - OR "before", "after", or "since", followed by a date as defined
             above,
           - OR "between <date> and <date>", where <date> is defined as above.

         Simple date expressions will first be parsed by Date::Parse. If this
         fails, grepmail will attempt to parse the date with Date::Manip, if
         the module is installed on the system. Use an empty specification
         (i.e. -d "") to find emails without a "Date: ..." line in the header.

         Date specifications without times are interpreted as having a time of
         midnight at the start of that day, except for "after" and "since"
         specifications, which are interpreted as midnight of the following
         day. For example, "between today and tomorrow" is the same as simply
         "today", and returns emails whose date has the current day. ("now" is
         interpreted as "today".) The date specification "after July 5th" will
         return emails whose date is midnight July 6th or later.

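         The midnight rule for date-only specifications can be sketched in
         Python. This is a sketch of the rule as documented above, not
         grepmail's own date handling (which uses Date::Parse and
         Date::Manip). The names boundary and matches are hypothetical, the
         year 2004 is an arbitrary choice for the example, and treating
         "before" as strictly earlier than the cutoff is an assumption.

```python
from datetime import date, datetime, time, timedelta


def boundary(keyword, day):
    """Return the cutoff instant for a date-only specification.

    keyword is "before", "after", or "since"; day is a datetime.date.
    """
    if keyword in ("after", "since"):
        day = day + timedelta(days=1)  # midnight of the following day
    return datetime.combine(day, time.min)  # midnight at the start of `day`


def matches(keyword, day, sent):
    """True if an email dated `sent` satisfies the specification."""
    cutoff = boundary(keyword, day)
    if keyword == "before":
        return sent < cutoff  # assumption: strictly earlier than the cutoff
    return sent >= cutoff


# "after July 5th" has its cutoff at midnight on July 6th.
print(boundary("after", date(2004, 7, 5)))
```
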
       -E
         Specify a complex search expression using logical operators. The
         current syntax allows the user to specify search expressions using
         Perl syntax. Three values can be used: $email (the entire email
         message), $email_header (just the header), or $email_body (just the
         body). A search is specified in the form "$email =~ /pattern/", and
         multiple searches can be combined using "&&" and "||" for "and" and
         "or".

         For example, the expression

           $email_header =~ /^From: .*\@coppit.org/ && $email =~ /grepmail/i

         will find all emails which originate from coppit.org (you must escape
         the "@" sign with a backslash), and which contain the keyword
         "grepmail" anywhere in the message, in any capitalization.

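         For readers more familiar with Python than Perl, the logic of the
         example expression can be mirrored roughly as follows. This is an
         illustration only and cannot be passed to -E, which takes Perl
         syntax. Splitting header from body at the first blank line and
         letting "^" match at the start of any header line (re.MULTILINE)
         are assumptions, and the sample message is made up.

```python
import re


def split_email(email):
    """Split a message into (header, body) at the first blank line."""
    header, _, body = email.partition("\n\n")
    return header, body


def matches(email):
    """Mirror: $email_header =~ /^From: .*\\@coppit.org/ && $email =~ /grepmail/i"""
    header, _body = split_email(email)
    from_coppit = re.search(r"^From: .*@coppit.org", header, re.MULTILINE)
    mentions_grepmail = re.search(r"grepmail", email, re.IGNORECASE)
    return bool(from_coppit and mentions_grepmail)


sample = (
    "From david@coppit.org Tue Jun  1 10:00:00 2010\n"
    "From: David Coppit <david@coppit.org>\n"
    "Subject: release notes\n"
    "\n"
    "A new version of Grepmail is available.\n"
)
print(matches(sample))
```
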
         -E is incompatible with -b, -h, and -e. Support for -i, -M, -S, and
         -Y in combination with -E has not yet been implemented.

         NOTE: The syntax of search expressions may change in the future. In
         particular, support for size, date, and other constraints may be
         added. The syntax may also be simplified in order to make expressions
         easier to write (perhaps at the expense of reduced functionality).

       -e
         Explicitly specify the search pattern. This is useful for specifying
         patterns that begin with "-", which would otherwise be interpreted as
         a flag.

       -f
         Obtain patterns from FILE, one per line. An empty file contains
         zero patterns, and therefore matches nothing.

       -F
         Force grepmail to process all files and streams as though they were
         mailboxes. (That is, skip the checks for non-mailbox ASCII files and
         for binary files that don't look like they are compressed using known
         schemes.)

       -h
         Asserts that the pattern must match in the header of the email.

       -H
         Print the header but not body of matching emails.

       -i
         Make the search case-insensitive (by analogy to grep -i).

       -j
         Asserts that the email "Status:" header must contain the given flags.
         Order and case are not important, so use -j AR or -j ra to search for
         emails which have been read and answered.

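         The order- and case-insensitive flag test can be sketched in Python
         as a subset check. This is a sketch, not grepmail's own code; the
         name status_matches is hypothetical, and it assumes each status flag
         is a single letter.

```python
def status_matches(wanted, status_header):
    """True if every flag in `wanted` appears in the Status: header value."""
    # Upper-casing both sides makes the comparison case-insensitive;
    # comparing sets makes it independent of order.
    return set(wanted.upper()) <= set(status_header.upper())


print(status_matches("AR", "RA"))
print(status_matches("ra", "ROA"))
print(status_matches("AR", "RO"))
```
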
       -l
         Output the names of files having an email matching the expression
         (by analogy to grep -l).

       -L
         Follow symbolic links. (Implies -R)

       -M
         Causes grepmail to ignore non-text MIME attachments. This removes
         false positives resulting from binaries encoded as ASCII attachments.

       -m
         Append "X-Mailfolder: <folder>" to all email headers, indicating
         which folder contained the matched email.

       -n
         Prefix each line with line number information. If multiple files are
         specified, the filename will precede the line number. NOTE: When used
         in conjunction with -m, the X-Mailfolder header has the same line
         number as the next (blank) line.

       -q
         Quiet mode. Suppress the output of warning messages about non-mailbox
         files, directories, etc.

       -r
         Generate a report of the names of the files containing emails
         matching the expression, along with a count of the number of matching
         emails.

       -R
         Causes grepmail to recurse into any directories encountered.

       -s
         Return emails which match the size (in bytes) specified with this
         flag. Note that this size includes the length of the header.

         Size constraints must take one of the following forms:
          - 12345: match size of exactly 12345
          - <12345, <=12345, >12345, >=12345: match size less than, less than
            or equal to, greater than, or greater than or equal to 12345
          - 10000-12345: match size between 10000 and 12345 inclusive

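         The three size constraint forms can be sketched in Python as a small
         matcher. This is an illustration of the documented forms rather than
         grepmail's own parser; the name size_matches is hypothetical, and
         raising ValueError on anything else is an assumption.

```python
import re


def size_matches(spec, size):
    """True if `size` (in bytes, header included) satisfies the size spec."""
    m = re.fullmatch(r"(\d+)-(\d+)", spec)
    if m:  # range form, inclusive at both ends
        return int(m.group(1)) <= size <= int(m.group(2))
    m = re.fullmatch(r"(<=|>=|<|>)?(\d+)", spec)
    if m is None:
        raise ValueError(f"unrecognized size spec: {spec!r}")
    op, limit = m.group(1), int(m.group(2))
    checks = {
        None: size == limit,  # bare number: exact match
        "<": size < limit,
        "<=": size <= limit,
        ">": size > limit,
        ">=": size >= limit,
    }
    return checks[op]


for spec in ("12345", "<12345", "<=12345", "10000-12345"):
    print(spec, size_matches(spec, 12345))
```
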
       -S
         Ignore signatures. The signature consists of everything after a line
         consisting of "-- ".

       -u
         Output only unique emails, by analogy to sort -u. grepmail determines
         email uniqueness by the Message-ID header.

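         Deduplication by Message-ID can be sketched in Python as follows.
         This is a sketch under assumptions, not grepmail's own code: the
         name unique_emails is hypothetical, the first copy of each message
         is the one kept, and messages without a Message-ID header are always
         kept (the documentation does not say how they are treated).

```python
import re


def unique_emails(emails):
    """Yield each email whose Message-ID has not been seen before."""
    seen = set()
    for email in emails:
        header = email.partition("\n\n")[0]
        m = re.search(r"^Message-ID:[ \t]*(.*)$", header,
                      re.MULTILINE | re.IGNORECASE)
        key = m.group(1).strip() if m else None
        if key is not None:
            if key in seen:
                continue  # duplicate: skip it
            seen.add(key)
        yield email  # assumption: emails without a Message-ID are kept


mailbox = [
    "Message-ID: <1@example.com>\nSubject: a\n\nfirst\n",
    "Message-ID: <2@example.com>\nSubject: b\n\nsecond\n",
    "Message-ID: <1@example.com>\nSubject: a\n\nfirst again\n",
]
print(len(list(unique_emails(mailbox))))
```
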
       -v
         Invert the sense of the search, by analogy to grep -v. This results
         in the set of emails printed being the complement of those that would
         be printed without the -v switch.

       -V
         Print the version and exit.

       -w
         Search for only those lines which contain the pattern as part of a
         word group. That is, the start of the pattern must match the start
         of a word, and the end of the pattern must match the end of a word.
         (Note that the start and end need not be for the same word.)

         If you are familiar with Perl regular expressions, this flag simply
         puts a "\b" before and after the search pattern.

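         The effect of the added "\b" assertions can be tried out in Python,
         whose re module treats "\b" the same way for these cases. This is a
         sketch; the name word_search is hypothetical, and the pattern is
         wrapped exactly as described above, with no extra grouping.

```python
import re


def word_search(pattern, text):
    """Search for `pattern` with a word boundary required at each end."""
    return re.search(r"\b" + pattern + r"\b", text) is not None


print(word_search("mail", "send mail now"))      # whole word
print(word_search("mail", "grepmail rocks"))     # starts mid-word
print(word_search("mail now", "send mail now"))  # spans two words
```

         Because nothing is grouped, a pattern with alternatives should be
         written in the parenthesized "(pattern1|pattern2|...)" form so that
         both boundaries apply to every alternative.
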
       -X
         Specify a regular expression for the signature separator. By default
         this pattern is '^-- $'.

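         How a separator pattern divides a message body from its signature
         can be sketched in Python. This is only an illustration, not
         grepmail's own code: the name strip_signature is hypothetical, and
         dropping the separator line along with the signature is an
         assumption. re.MULTILINE lets "^" and "$" match at line boundaries.

```python
import re


def strip_signature(body, separator=r"^-- $"):
    """Return `body` without the separator line and everything after it."""
    m = re.search(separator, body, re.MULTILINE)
    return body[: m.start()] if m else body


# Default separator: dash, dash, space on a line of its own.
print(repr(strip_signature("Hello\n-- \nDavid\n")))
# A custom separator, as could be given with -X.
print(repr(strip_signature("Hello\n____\nDavid\n", r"^_+$")))
```
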
       -Y
         Specify a pattern which indicates specific headers to be searched.
         The search will automatically treat headers which span multiple lines
         as one long line. This flag implies -h.

         In the style of procmail, special strings in the pattern will be
         expanded as follows:

           If the regular expression contains "^TO:" it will be substituted by

             ^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):

           which should match all headers with destination addresses.

           If the regular expression contains "^FROM_DAEMON:" it will be
           substituted by

             (^(Mailing-List:|Precedence:.*(junk|bulk|list)|To: Multiple recipients of |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*[^(.%@a-z0-9])?(Post(ma?(st(e?r)?|n)|office)|(send)?Mail(er)?|daemon|m(mdf|ajordomo)|n?uucp|LIST(SERV|proc)|NETSERV|o(wner|ps)|r(e(quest|sponse)|oot)|b(ounce|bs\.smtp)|echo|mirror|s(erv(ices?|er)|mtp(error)?|ystem)|A(dmin(istrator)?|MMGR|utoanswer))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t ][^<)]*(\(.*\).*)?)?

           which should catch mails coming from most daemons.

           If the regular expression contains "^FROM_MAILER:" it will be
           substituted by

             (^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From)([^>]*[^(.%@a-z0-9])?(Post(ma(st(er)?|n)|office)|(send)?Mail(er)?|daemon|mmdf|n?uucp|ops|r(esponse|oot)|(bbs\.)?smtp(error)?|s(erv(ices?|er)|ystem)|A(dmin(istrator)?|MMGR))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t][^<)]*(\(.*\).*)?)?$([^>]|$))

           (a stripped down version of "^FROM_DAEMON:"), which should catch
           mails coming from most mailer-daemons.

           So, to search for all emails to or from "Andy":

             grepmail -Y '(^TO:|^From:)' Andy mailbox

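         The two steps described for -Y, unfolding wrapped headers and
         expanding "^TO:", can be sketched in Python. This is a sketch of the
         documented behavior, not grepmail's own code. The names unfold and
         header_search are hypothetical, only the "^TO:" expansion is shown,
         and requiring the search pattern to match on the same unfolded
         header line that the header pattern selects is an assumption.

```python
import re

# The documented expansion of "^TO:".
TO_EXPANSION = (
    r"^((Original-)?(Resent-)?(To|Cc|Bcc)"
    r"|(X-Envelope|Apparently(-Resent)?)-To):"
)


def unfold(header):
    """Join continuation lines (starting with space or tab) to the line above."""
    return re.sub(r"\n[ \t]+", " ", header)


def header_search(header_pattern, pattern, header):
    """True if a header selected by `header_pattern` also matches `pattern`."""
    selector = re.compile(header_pattern.replace("^TO:", TO_EXPANSION))
    needle = re.compile(pattern)
    for line in unfold(header).split("\n"):
        if selector.search(line) and needle.search(line):
            return True
    return False


header = (
    "From: Bob <bob@example.com>\n"
    "To: Alice <alice@example.com>,\n"
    "\tAndy <andy@example.com>\n"
    "Subject: lunch"
)
print(header_search("(^TO:|^From:)", "Andy", header))
```
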
       --help
         Print a help message summarizing the usage.

       --
         All arguments following -- are treated as mail folders.

EXAMPLES

       Count the number of emails ("." matches every email):

         grepmail -r . sent-mail

       Get all email between 2000 and 3000 bytes about books:

         grepmail books -s 2000-3000 sent-mail

       Get all email that you mailed yesterday:

         grepmail -d yesterday sent-mail

       Get all email that you mailed before the first Thursday in June 1998
       that pertains to research (requires Date::Manip):

         grepmail research -d "before 1st thursday in June 1998" sent-mail

       Get all email that you mailed before the first of June 1998 that
       pertains to research:

         grepmail research -d "before 6/1/98" sent-mail

       Get all email you received since 8/20/98 that wasn't about research or
       your job, ignoring case:

         grepmail -iv "(research|job)" -d "since 8/20/98" saved-mail

       Get all email about mime but not about Netscape. Constrain the search
       to match the body, since most headers contain the text "mime":

         grepmail -b mime saved-mail | grepmail Netscape -v

       Print a list of all mailboxes containing a message from Rodney.
       Constrain the search to the headers, since quoted emails may match the
       pattern:

         grepmail -hl "^From.*Rodney" saved-mail*

       Find all emails with the text "Pilot" in both the header and the body:

         grepmail -hb "Pilot" saved-mail*

       Print a count of the number of messages about grepmail in all
       saved-mail mailboxes:

         grepmail -br grepmail saved-mail*

       Remove any duplicates from a mailbox:

         grepmail -u saved-mail

       Convert a Gnus mailbox to mbox format:

         grepmail . gnus-mailbox-dir/* > mbox

       Search for all emails to or from an address (taking into account
       wrapped headers and different header names):

         grepmail -Y '(^TO:|^From:)' my@email.address saved-mail

       Find all emails from postmasters:

         grepmail -Y '^FROM_MAILER:' . saved-mail

FILES

       grepmail does not create temporary files while decompressing compressed
       archives. The last version to do so was 3.5. While the new design
       uses more memory, the code is much simpler, and there is less chance
       that email can be read by malicious third parties. Memory usage is
       determined by the size of the largest email message in the mailbox.

ENVIRONMENT

       The MAILDIR environment variable can be used to specify the default
       mail directory. This directory will be searched if the specified
       mailbox cannot be found directly.

       The HOME environment variable is also used to find mailboxes if they
       cannot be found directly. It is also used to store grepmail state
       information such as its cache file.

BUGS AND LIMITATIONS

       Patterns containing "$" may cause problems
         Currently I look for "$" followed by a non-word character and replace
         it with the line ending for the current file (either "\n" or "\r\n").
         This may cause problems with complex patterns specified with -E, but
         I'm not aware of any.

       Mails without bodies cause problems
         According to RFC 822, mail messages need not have message bodies.
         I've found and removed one bug related to this. I'm not sure if there
         are others.

       Complex single-point dates not parsed correctly
         If you specify a point date like "September 1, 2004", grepmail
         creates a date range that includes the entire day of September 1,
         2004. If you specify a complex point date such as "today", "1st
         Monday in July", or "9/1/2004 at 0:00", grepmail may parse the time
         incorrectly.

         The reason for this problem is that Date::Manip, as of version 5.42,
         forces default values for parsed dates and times. This means that
         grepmail has a hard time determining whether the user supplied
         certain time/date fields. (e.g. Did Date::Manip provide a default
         time of 0:00, or did the user specify it?) grepmail tries to work
         around this problem, but the workaround is inherently incomplete in
         some rare cases.

       File names that look like flags cause problems
         In some special circumstances, grepmail will be confused by files
         whose names look like flags. In such cases, use the -e flag to
         specify the search pattern.

LICENSE

       This code is distributed under the GNU General Public License (GPL).
       See the file LICENSE in the distribution,
       http://www.opensource.org/gpl-license.html, and
       http://www.opensource.org/.

AUTHOR

       David Coppit <david@coppit.org>

SEE ALSO

       elm(1), mail(1), grep(1), perl(1), printmail(1), Mail::Internet(3),
       procmailrc(5). Crocker, D. H., Standard for the Format of ARPA
       Internet Text Messages, RFC 822.

perl v5.12.0                      2010-06-01                       GREPMAIL(1)