GREPMAIL(1)           User Contributed Perl Documentation          GREPMAIL(1)

NAME

       grepmail - search mailboxes for mail matching a regular expression

SYNOPSIS

         grepmail [--help|--version] [-abBDFhHilLmrRuvVw] [-C <cache-file>]
           [-j <status>] [-s <sizespec>] [-d <date-specification>]
           [-X <signature-pattern>] [-Y <header-pattern>]
           [[-e] <pattern>|-E <expr>|-f <pattern-file>] <files...>

DESCRIPTION

         grepmail looks for mail messages containing a pattern, and prints the
         resulting messages on standard output.

         By default grepmail looks in both header and body for the specified
         pattern.

         When redirected to a file, the result is another mailbox, which can,
         in turn, be handled by standard mail user agents, such as elm, or
         even used as input for another instance of grepmail.

         At least one of -E, -e, -d, -s, or -u must be specified. The pattern
         is optional if -d, -s, and/or -u is used. The -e flag is optional if
         there is no file whose name is the pattern. The -E option can be used
         to specify complex search expressions involving logical operators.
         (See below.)

         If a mailbox cannot be found, grepmail first searches the directory
         specified by the MAILDIR environment variable (if one is defined),
         then searches the $HOME/mail, $HOME/Mail, and $HOME/Mailbox
         directories.
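
         The lookup order described above can be sketched in Python. This is
         only an illustration of the documented order, not grepmail's own
         code: the function name candidate_paths is hypothetical, and trying
         the name exactly as given first is an assumption.

```python
import posixpath

def candidate_paths(name, env):
    """List the paths tried for mailbox `name`, in the documented order."""
    paths = [name]  # assumption: the name as given is tried first
    if env.get("MAILDIR"):
        # MAILDIR is consulted only if it is defined
        paths.append(posixpath.join(env["MAILDIR"], name))
    home = env.get("HOME")
    if home:
        for sub in ("mail", "Mail", "Mailbox"):
            paths.append(posixpath.join(home, sub, name))
    return paths
```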

OPTIONS AND ARGUMENTS

       Many of the options and arguments are analogous to those of grep.

       pattern
         The pattern to search for in the mail message. May be any Perl
         regular expression, but should be quoted on the command line to
         protect against globbing (shell expansion). To search for more than
         one pattern, use the form "(pattern1|pattern2|...)".

         Note that complex pattern features such as "(?>...)" require that you
         use a version of perl which supports them. You can use the pattern
         "()" to indicate that you do not want to match anything. This is
         useful if you want to initialize the cache without printing any
         output.

       mailbox
         Mailboxes must be in the traditional UNIX "/bin/mail" mailbox format.
         The mailboxes may be compressed by gzip, bzip2, lzip or xz, in which
         case the associated compression tool must be installed on the
         system, as well as a recent version of the Mail::Mbox::MessageParser
         Perl module that supports the format.

         If no mailbox is specified, grepmail takes input from stdin, which
         can be compressed or not. grepmail's behavior is undefined when ASCII
         and binary data are piped together as input.

       -a
         Use arrival date instead of sent date.

       -b
         Asserts that the pattern must match in the body of the email.

       -B
         Print the body but with only minimal ('From ', 'From:', 'Subject:',
         'Date:') headers. This flag can be used with -H, in which case it
         will print only short headers and no email bodies.

       -C
         Specifies the location of the cache file. The default is
         $HOME/.grepmail-cache.

       -D
         Enable debug mode, which prints diagnostic messages.

       -d
         Date specifications must take one of these forms:
           - a date like "today", "yesterday", "5/18/93", "5 days ago",
             "5 weeks ago",
           - OR "before", "after", or "since", followed by a date as defined
             above,
           - OR "between <date> and <date>", where <date> is defined as above.

         Simple date expressions will first be parsed by Date::Parse. If this
         fails, grepmail will attempt to parse the date with Date::Manip, if
         the module is installed on the system. Use an empty date
         specification (i.e. -d "") to find emails without a "Date: ..." line
         in the header.

         Date specifications without times are interpreted as having a time of
         midnight of that day (that is, the start of the day), except for
         "after" and "since" specifications, which are interpreted as midnight
         of the following day. For example, "between today and tomorrow" is
         the same as simply "today", and returns emails whose date has the
         current day. ("now" is interpreted as "today".) The date
         specification "after July 5th" will return emails whose date is
         midnight July 6th or later.

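         The midnight rules for dates given without a time can be sketched in
         Python. The helper names below are hypothetical, and treating the
         end of a single-day range as the following midnight is an
         assumption; this illustrates the description rather than grepmail's
         actual date handling.

```python
from datetime import date, datetime, time, timedelta

def midnight(d):
    """Midnight at the start of day `d`."""
    return datetime.combine(d, time.min)

def after_bound(d):
    """Earliest date matched by "after <d>" when <d> has no time of day."""
    return midnight(d + timedelta(days=1))

def day_range(d):
    """Start and end of the range covered by a bare date such as "today"."""
    return midnight(d), midnight(d + timedelta(days=1))
```

         With these helpers, after_bound(date(2004, 7, 5)) is midnight on
         July 6th, matching the "after July 5th" example.
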
       -E
         Specify a complex search expression using logical operators. The
         current syntax allows the user to specify search expressions using
         Perl syntax. Three variables can be used: $email (the entire email
         message), $email_header (just the header), or $email_body (just the
         body). A search is specified in the form "$email =~ /pattern/", and
         multiple searches can be combined using "&&" and "||" for "and" and
         "or".

         For example, the expression

           $email_header =~ /^From: .*\@coppit.org/ && $email =~ /grepmail/i

         will find all emails which originate from coppit.org (you must escape
         the "@" sign with a backslash), and which contain the keyword
         "grepmail" anywhere in the message, in any capitalization.

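         For readers more at home in Python than Perl, the logic of the
         example expression can be rendered roughly as follows. The function
         name matches is hypothetical, and letting "^" match at the start of
         any header line (re.MULTILINE) is an assumption about how the
         pattern is applied. grepmail itself evaluates the Perl expression.

```python
import re

def matches(email_header, email):
    """Python rendering of the example -E expression (illustrative only)."""
    # Assumption: "^" may match at the start of any header line.
    from_coppit = re.search(r"^From: .*@coppit.org", email_header, re.MULTILINE)
    # The /i modifier corresponds to re.IGNORECASE.
    mentions_grepmail = re.search(r"grepmail", email, re.IGNORECASE)
    # "&&" in the expression corresponds to a logical "and".
    return bool(from_coppit and mentions_grepmail)
```
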
         -E is incompatible with -b, -h, and -e. Support for -i, -M, -S, and
         -Y in combination with -E has not yet been implemented.

         NOTE: The syntax of search expressions may change in the future. In
         particular, support for size, date, and other constraints may be
         added. The syntax may also be simplified in order to make expression
         formation easier to use (and perhaps at the expense of reduced
         functionality).

       -e
         Explicitly specify the search pattern. This is useful for specifying
         patterns that begin with "-", which would otherwise be interpreted as
         a flag.

       -f
         Obtain patterns from FILE, one per line. The empty file contains
         zero patterns, and therefore matches nothing.

       -F
         Force grepmail to process all files and streams as though they were
         mailboxes. (i.e. Skip checks for non-mailbox ASCII files or binary
         files that don't look like they are compressed using known schemes.)

       -h
         Asserts that the pattern must match in the header of the email.

       -H
         Print the header but not body of matching emails.

       -i
         Make the search case-insensitive (by analogy to grep -i).

       -j
         Asserts that the email "Status:" header must contain the given flags.
         Order and case are not important, so use -j AR or -j ra to search for
         emails which have been read and answered.

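         The order- and case-insensitive comparison described for -j can be
         sketched in Python as a subset test. The function name
         status_matches is hypothetical, and reading "must contain the given
         flags" as "all given flags are present" is an assumption.

```python
def status_matches(status_header, wanted):
    """True if every flag in `wanted` appears in the Status: value."""
    # Upper-casing both sides makes case irrelevant; sets make order irrelevant.
    return set(wanted.upper()) <= set(status_header.upper())
```
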
       -l
         Output the names of files having an email matching the expression
         (by analogy to grep -l).

       -L
         Follow symbolic links. (Implies -R)

       -M
         Causes grepmail to ignore non-text MIME attachments. This removes
         false positives resulting from binaries encoded as ASCII attachments.

       -m
         Append "X-Mailfolder: <folder>" to all email headers, indicating
         which folder contained the matched email.

       -n
         Prefix each line with line number information. If multiple files are
         specified, the filename will precede the line number. NOTE: When used
         in conjunction with -m, the X-Mailfolder header has the same line
         number as the next (blank) line.

       -q
         Quiet mode. Suppress the output of warning messages about non-mailbox
         files, directories, etc.

       -r
         Generate a report of the names of the files containing emails
         matching the expression, along with a count of the number of matching
         emails.

       -R
         Causes grepmail to recurse any directories encountered.

       -s
         Return emails which match the size (in bytes) specified with this
         flag. Note that this size includes the length of the header.

         Size constraints must take one of these forms:
          - 12345: match size of exactly 12345
          - <12345, <=12345, >12345, >=12345: match size less than, less than
            or equal, greater than, or greater than or equal to 12345
          - 10000-12345: match size between 10000 and 12345 inclusive

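         The three forms of size constraint can be sketched as a small Python
         matcher. The function name size_matches and the error handling are
         hypothetical; the sketch follows the forms listed above and is not
         grepmail's own parser.

```python
import re

def size_matches(spec, size):
    """Check `size` (in bytes) against a -s style constraint string."""
    # Range form, e.g. "10000-12345" (inclusive at both ends)
    m = re.fullmatch(r"(\d+)-(\d+)", spec)
    if m:
        return int(m.group(1)) <= size <= int(m.group(2))
    # Exact form "12345" or comparison form such as "<=12345"
    m = re.fullmatch(r"(<=|>=|<|>)?(\d+)", spec)
    if not m:
        raise ValueError("unrecognized size constraint: " + spec)
    op, n = m.group(1), int(m.group(2))
    if op is None:
        return size == n
    return {"<": size < n, "<=": size <= n, ">": size > n, ">=": size >= n}[op]
```
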
       -S
         Ignore signatures. The signature consists of everything after a line
         consisting of "-- ".

       -u
         Output only unique emails, by analogy to sort -u. grepmail determines
         email uniqueness by the Message-ID header.

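         Uniqueness by Message-ID can be sketched in Python with the standard
         email package. The function name unique_messages is hypothetical,
         and always keeping messages that lack a Message-ID header is an
         assumption, since the text above does not say how they are treated.

```python
from email import message_from_string

def unique_messages(raw_messages):
    """Keep the first message seen for each Message-ID, preserving order."""
    seen = set()
    kept = []
    for raw in raw_messages:
        msg_id = message_from_string(raw)["Message-ID"]
        if msg_id is not None:
            if msg_id in seen:
                continue  # duplicate of an earlier message
            seen.add(msg_id)
        kept.append(raw)  # assumption: messages without an ID are kept
    return kept
```
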
       -v
         Invert the sense of the search, by analogy to grep -v. This results
         in the set of emails printed being the complement of those that would
         be printed without the -v switch.

       -V
         Print the version and exit.

       -w
         Search for only those lines which contain the pattern as part of a
         word group. That is, the start of the pattern must match the start
         of a word, and the end of the pattern must match the end of a word.
         (Note that the start and end need not be for the same word.)

         If you are familiar with Perl regular expressions, this flag simply
         puts a "\b" before and after the search pattern.

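         The effect of wrapping a pattern in "\b" can be tried out in Python,
         whose re module uses the same word-boundary assertion. The function
         name word_pattern is hypothetical.

```python
import re

def word_pattern(pattern):
    """Wrap `pattern` in word-boundary assertions, as described for -w."""
    return r"\b" + pattern + r"\b"
```

         Here re.search(word_pattern("cat"), "concatenate") finds nothing,
         while word_pattern("black cat") still matches "a black cat sat",
         because the start and end may belong to different words.
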
       -X
         Specify a regular expression for the signature separator. By default
         this pattern is '^-- $'.

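         Taken together, -S and -X describe cutting the body at the first
         line that matches the separator. A minimal Python sketch, assuming
         the separator is applied line by line (re.MULTILINE) and that the
         separator line itself is dropped along with the signature; the
         function name strip_signature is hypothetical.

```python
import re

def strip_signature(body, separator=r"^-- $"):
    """Return `body` without the signature that follows the separator line."""
    m = re.search(separator, body, re.MULTILINE)
    # No separator found: the body has no signature to remove.
    return body if m is None else body[:m.start()]
```
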
       -Y
         Specify a pattern which indicates specific headers to be searched.
         The search will automatically treat headers which span multiple lines
         as one long line. This flag implies -h.

         In the style of procmail, special strings in the pattern will be
         expanded as follows:

           If the regular expression contains "^TO:" it will be substituted by

             ^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):

           which should match all headers with destination addresses.

           If the regular expression contains "^FROM_DAEMON:" it will be
           substituted by

             (^(Mailing-List:|Precedence:.*(junk|bulk|list)|To: Multiple recipients of |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*[^(.%@a-z0-9])?(Post(ma?(st(e?r)?|n)|office)|(send)?Mail(er)?|daemon|m(mdf|ajordomo)|n?uucp|LIST(SERV|proc)|NETSERV|o(wner|ps)|r(e(quest|sponse)|oot)|b(ounce|bs\.smtp)|echo|mirror|s(erv(ices?|er)|mtp(error)?|ystem)|A(dmin(istrator)?|MMGR|utoanswer))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t ][^<)]*(\(.*\).*)?)?

           which should catch mails coming from most daemons.

           If the regular expression contains "^FROM_MAILER:" it will be
           substituted by

             (^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From)([^>]*[^(.%@a-z0-9])?(Post(ma(st(er)?|n)|office)|(send)?Mail(er)?|daemon|mmdf|n?uucp|ops|r(esponse|oot)|(bbs\.)?smtp(error)?|s(erv(ices?|er)|ystem)|A(dmin(istrator)?|MMGR))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t][^<)]*(\(.*\).*)?)?$([^>]|$))

           (a stripped down version of "^FROM_DAEMON:"), which should catch
           mails coming from most mailer-daemons.

           So, to search for all emails to or from "Andy":

             grepmail -Y '(^TO:|^From:)' Andy mailbox

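         The "^TO:" expansion and the joining of wrapped headers can be
         sketched in Python. All function names are hypothetical, and the
         sketch only selects the header lines that a -Y pattern would pick
         out; it does not reproduce grepmail's actual matching code.

```python
import re

# Expansion text for "^TO:", copied from the description above.
TO_EXPANSION = (r"^((Original-)?(Resent-)?(To|Cc|Bcc)"
                r"|(X-Envelope|Apparently(-Resent)?)-To):")

def unfold(header):
    """Join continuation lines so each header becomes one long line."""
    return re.sub(r"\r?\n[ \t]+", " ", header)

def expand(y_pattern):
    """Substitute the special string "^TO:" in a -Y pattern."""
    return y_pattern.replace("^TO:", TO_EXPANSION)

def selected_headers(header, y_pattern):
    """Return the unfolded header lines chosen by the -Y pattern."""
    rx = re.compile(expand(y_pattern))
    return [line for line in unfold(header).splitlines() if rx.search(line)]
```

         The search pattern ("Andy" in the example above) would then be
         looked for only within the selected header lines.
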
       --help
         Print a help message summarizing the usage.

       --
         All arguments following -- are treated as mail folders.

EXAMPLES

       Count the number of emails. ("." matches every email.)

         grepmail -r . sent-mail

       Get all email between 2000 and 3000 bytes about books:

         grepmail books -s 2000-3000 sent-mail

       Get all email that you mailed yesterday:

         grepmail -d yesterday sent-mail

       Get all email that you mailed before the first Thursday in June 1998
       that pertains to research (requires Date::Manip):

         grepmail research -d "before 1st thursday in June 1998" sent-mail

       Get all email that you mailed before the first of June 1998 that
       pertains to research:

         grepmail research -d "before 6/1/98" sent-mail

       Get all email you received since 8/20/98 that wasn't about research or
       your job, ignoring case:

         grepmail -iv "(research|job)" -d "since 8/20/98" saved-mail

       Get all email about mime but not about Netscape. Constrain the search
       to match the body, since most headers contain the text "mime":

         grepmail -b mime saved-mail | grepmail Netscape -v

       Print a list of all mailboxes containing a message from Rodney.
       Constrain the search to the headers, since quoted emails may match the
       pattern:

         grepmail -hl "^From.*Rodney" saved-mail*

       Find all emails with the text "Pilot" in both the header and the body:

         grepmail -hb "Pilot" saved-mail*

       Print a count of the number of messages about grepmail in all
       saved-mail mailboxes:

         grepmail -br grepmail saved-mail*

       Remove any duplicates from a mailbox:

         grepmail -u saved-mail

       Convert a Gnus mailbox to mbox format:

         grepmail . gnus-mailbox-dir/* > mbox

       Search for all emails to or from an address (taking into account
       wrapped headers and different header names):

         grepmail -Y '(^TO:|^From:)' my@email.address saved-mail

       Find all emails from postmasters:

         grepmail -Y '^FROM_MAILER:' . saved-mail

FILES

       grepmail does not create temporary files while decompressing compressed
       archives. The last version to do this was 3.5. While the new design
       uses more memory, the code is much simpler, and there is less chance
       that email can be read by malicious third parties. Memory usage is
       determined by the size of the largest email message in the mailbox.

ENVIRONMENT

       The MAILDIR environment variable can be used to specify the default
       mail directory. This directory will be searched if the specified
       mailbox cannot be found directly.

       The HOME environment variable is also used to find mailboxes if they
       cannot be found directly. It is also used to store grepmail state
       information such as its cache file.

BUGS AND LIMITATIONS

       Patterns containing "$" may cause problems
         Currently I look for "$" followed by a non-word character and replace
         it with the line ending for the current file (either "\n" or "\r\n").
         This may cause problems with complex patterns specified with -E, but
         I'm not aware of any.

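         The substitution described above can be approximated in Python to
         see which "$" characters are affected. This follows the wording of
         the description literally (a "$" must be followed by a non-word
         character) and is an assumption about the behavior, not a copy of
         grepmail's code; the function name replace_dollar is hypothetical.

```python
import re

def replace_dollar(pattern, line_ending):
    """Replace each "$" that precedes a non-word character."""
    # A callable replacement inserts `line_ending` verbatim, with no
    # backslash processing by re.sub.
    return re.sub(r"\$(?=\W)", lambda m: line_ending, pattern)
```

         Under this reading, the "$" in "$email" is left alone because it is
         followed by a word character, while the "$" in "/a$/" is replaced.
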
       Mails without bodies cause problems
         According to RFC 822, mail messages need not have message bodies.
         I've found and removed one bug related to this. I'm not sure if there
         are others.

       Complex single-point dates not parsed correctly
         If you specify a point date like "September 1, 2004", grepmail
         creates a date range that includes the entire day of September 1,
         2004. If you specify a complex point date such as "today", "1st
         Monday in July", or "9/1/2004 at 0:00" grepmail may parse the time
         incorrectly.

         The reason for this problem is that Date::Manip, as of version 5.42,
         forces default values for parsed dates and times. This means that
         grepmail has a hard time determining whether the user supplied
         certain time/date fields. (e.g. Did Date::Manip provide a default
         time of 0:00, or did the user specify it?) grepmail tries to work
         around this problem, but the workaround is inherently incomplete in
         some rare cases.

       File names that look like flags cause problems
         In some special circumstances, grepmail will be confused by files
         whose names look like flags. In such cases, use the -e flag to
         specify the search pattern.

LICENSE

       This code is distributed under the GNU General Public License (GPL)
       Version 2. See the file LICENSE in the distribution for details.

AUTHOR

       David Coppit <david@coppit.org>

SEE ALSO

       elm(1), mail(1), grep(1), perl(1), printmail(1), Mail::Internet(3),
       procmailrc(5). Crocker, D. H., Standard for the Format of ARPA
       Internet Text Messages, RFC 822.

perl v5.36.0                      2023-01-19                       GREPMAIL(1)