GREPMAIL(1) User Contributed Perl Documentation GREPMAIL(1)

NAME
grepmail - search mailboxes for mail matching a regular expression
7
SYNOPSIS
grepmail [--help|--version] [-abBDFhHilLmMnqrRSuvVw] [-C <cache-file>]
[-j <status>] [-s <sizespec>] [-d <date-specification>]
[-X <signature-pattern>] [-Y <header-pattern>]
[[-e] <pattern>|-E <expr>|-f <pattern-file>] <files...>
13
DESCRIPTION
grepmail looks for mail messages containing a pattern, and prints the
resulting messages on standard output.
17
18 By default grepmail looks in both header and body for the specified
19 pattern.
20
21 When redirected to a file, the result is another mailbox, which can,
22 in turn, be handled by standard User Agents, such as elm, or even
23 used as input for another instance of grepmail.
24
25 At least one of -E, -e, -d, -s, or -u must be specified. The pattern
26 is optional if -d, -s, and/or -u is used. The -e flag is optional if
27 there is no file whose name is the pattern. The -E option can be used
28 to specify complex search expressions involving logical operators.
29 (See below.)
30
If a mailbox cannot be found as given, grepmail first searches the
directory specified by the MAILDIR environment variable (if one is
defined), then searches the $HOME/mail, $HOME/Mail, and $HOME/Mailbox
directories.
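
The lookup order described above can be pictured with a short Python sketch. This is not grepmail's own code (grepmail is written in Perl); the function name candidate_paths and the env argument are made up for illustration, and the sketch only lists the paths that would be tried, in order.

```python
import posixpath

def candidate_paths(mailbox, env):
    """Return the paths to try, in order, for a mailbox name.

    Illustrative only: mirrors the search order described in the text.
    """
    paths = [mailbox]  # first, the name exactly as given
    if env.get("MAILDIR"):
        paths.append(posixpath.join(env["MAILDIR"], mailbox))
    home = env.get("HOME")
    if home:
        for sub in ("mail", "Mail", "Mailbox"):
            paths.append(posixpath.join(home, sub, mailbox))
    return paths

for path in candidate_paths("inbox", {"MAILDIR": "/var/mail/me", "HOME": "/home/me"}):
    print(path)
```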
35
OPTIONS AND ARGUMENTS
Many of the options and arguments are analogous to those of grep.
38
39 pattern
40 The pattern to search for in the mail message. May be any Perl
41 regular expression, but should be quoted on the command line to
42 protect against globbing (shell expansion). To search for more than
43 one pattern, use the form "(pattern1|pattern2|...)".
44
45 Note that complex pattern features such as "(?>...)" require that you
46 use a version of perl which supports them. You can use the pattern
47 "()" to indicate that you do not want to match anything. This is
48 useful if you want to initialize the cache without printing any
49 output.
50
51 mailbox
Mailboxes must be in the traditional UNIX "/bin/mail" mailbox (mbox)
format. A mailbox may be compressed with gzip, bzip2, lzip, or xz, in
which case the associated compression tool must be installed on the
system, along with a version of the Mail::Mbox::MessageParser Perl
module recent enough to support that format.

If no mailbox is specified, grepmail takes input from stdin, which
may be compressed or not. grepmail's behavior is undefined when ASCII
and binary data are piped together as input.
61
62 -a
63 Use arrival date instead of sent date.
64
65 -b
66 Asserts that the pattern must match in the body of the email.
67
68 -B
69 Print the body but with only minimal ('From ', 'From:', 'Subject:',
70 'Date:') headers. This flag can be used with -H, in which case it
71 will print only short headers and no email bodies.
72
73 -C
74 Specifies the location of the cache file. The default is
75 $HOME/.grepmail-cache.
76
77 -D
78 Enable debug mode, which prints diagnostic messages.
79
80 -d
A date specification must take one of the following forms:
- a date such as "today", "yesterday", "5/18/93", "5 days ago", or
  "5 weeks ago",
- "before", "after", or "since", followed by a date as defined above,
- "between <date> and <date>", where <date> is defined as above.
87
88 Simple date expressions will first be parsed by Date::Parse. If this
89 fails, grepmail will attempt to parse the date with Date::Manip, if
90 the module is installed on the system. Use an empty pattern (i.e. -d
91 "") to find emails without a "Date: ..." line in the header.
92
A date specification without a time is interpreted as midnight at the
start of that day, except for "after" and "since" specifications,
which are interpreted as midnight at the start of the following day.
For example, "between today and tomorrow" is the same as simply
"today", and returns emails dated on the current day. ("now" is
interpreted as "today".) The date specification "after July 5th"
returns emails dated midnight July 6th or later.
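
The midnight rules above can be sketched in Python as follows. This is an illustration of the documented behavior, not grepmail's implementation. The function name date_bounds and the "on" keyword (standing for a bare date such as "today") are invented here, and treating the end bound as exclusive is an assumption, since the text does not say.

```python
from datetime import date, datetime, time, timedelta

def date_bounds(keyword, day):
    """Map a time-less date spec to (start, end); None means unbounded.

    keyword: "on" (hypothetical label for a bare date), "before",
    "after" or "since". The end bound is assumed to be exclusive.
    """
    midnight = datetime.combine(day, time.min)   # 00:00 of that day
    next_midnight = midnight + timedelta(days=1)  # 00:00 of the next day
    if keyword == "on":
        return midnight, next_midnight
    if keyword == "before":
        return None, midnight
    if keyword in ("after", "since"):
        return next_midnight, None
    raise ValueError(f"unknown keyword: {keyword}")

# "after July 5th" starts at midnight July 6th.
start, end = date_bounds("after", date(2004, 7, 5))
print(start, end)
```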
100
101 -E
102 Specify a complex search expression using logical operators. The
103 current syntax allows the user to specify search expressions using
104 Perl syntax. Three values can be used: $email (the entire email
105 message), $email_header (just the header), or $email_body (just the
106 body). A search is specified in the form "$email =~ /pattern/", and
107 multiple searches can be combined using "&&" and "||" for "and" and
108 "or".
109
110 For example, the expression
111
112 $email_header =~ /^From: .*\@coppit.org/ && $email =~ /grepmail/i
113
114 will find all emails which originate from coppit.org (you must escape
115 the "@" sign with a backslash), and which contain the keyword
116 "grepmail" anywhere in the message, in any capitalization.
117
-E is incompatible with -b, -h, and -e. Support for -i, -M, -S, and
-Y in combination with -E has not yet been implemented.
120
121 NOTE: The syntax of search expressions may change in the future. In
122 particular, support for size, date, and other constraints may be
123 added. The syntax may also be simplified in order to make expression
124 formation easier to use (and perhaps at the expense of reduced
125 functionality).
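
For readers more familiar with Python than Perl, the example -E expression above corresponds roughly to the sketch below. It is only an analogy: grepmail evaluates Perl, the helper names split_email and matches are hypothetical, and the use of multi-line matching for "^" is an assumption about how the header is searched.

```python
import re

def split_email(email):
    """Split a message into (header, body) at the first blank line."""
    header, _, body = email.partition("\n\n")
    return header, body

def matches(email):
    """Rough analog of:
    $email_header =~ /^From: .*\@coppit.org/ && $email =~ /grepmail/i
    """
    header, _body = split_email(email)
    # Assumption: "^" may match at the start of any header line.
    from_coppit = re.search(r"^From: .*@coppit.org", header, re.MULTILINE)
    mentions = re.search(r"grepmail", email, re.IGNORECASE)
    return bool(from_coppit and mentions)

msg = "From: David <david@coppit.org>\nSubject: hello\n\nI use GrepMail daily.\n"
print(matches(msg))
```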
126
127 -e
128 Explicitly specify the search pattern. This is useful for specifying
129 patterns that begin with "-", which would otherwise be interpreted as
130 a flag.
131
132 -f
133 Obtain patterns from FILE, one per line. The empty file contains
134 zero patterns, and therefore matches nothing.
135
136 -F
137 Force grepmail to process all files and streams as though they were
138 mailboxes. (i.e. Skip checks for non-mailbox ASCII files or binary
139 files that don't look like they are compressed using known schemes.)
140
141 -h
142 Asserts that the pattern must match in the header of the email.
143
144 -H
145 Print the header but not body of matching emails.
146
147 -i
148 Make the search case-insensitive (by analogy to grep -i).
149
150 -j
151 Asserts that the email "Status:" header must contain the given flags.
152 Order and case are not important, so use -j AR or -j ra to search for
153 emails which have been read and answered.
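
The order- and case-insensitive flag test can be pictured as a set comparison. The Python sketch below is an illustration under that reading, not grepmail's code; status_matches is a made-up name, and status_header stands for the value of the "Status:" header.

```python
def status_matches(status_header, wanted):
    """True if every wanted flag appears in the Status: header value.

    Order and case are ignored, as described for -j.
    """
    have = set(status_header.upper())
    return set(wanted.upper()) <= have

print(status_matches("RA", "ar"))
print(status_matches("O", "AR"))
```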
154
155 -l
Output the names of files containing an email that matches the
expression (by analogy to grep -l).
158
159 -L
160 Follow symbolic links. (Implies -R)
161
162 -M
163 Causes grepmail to ignore non-text MIME attachments. This removes
164 false positives resulting from binaries encoded as ASCII attachments.
165
166 -m
167 Append "X-Mailfolder: <folder>" to all email headers, indicating
168 which folder contained the matched email.
169
170 -n
171 Prefix each line with line number information. If multiple files are
172 specified, the filename will precede the line number. NOTE: When used
173 in conjunction with -m, the X-Mailfolder header has the same line
174 number as the next (blank) line.
175
176 -q
177 Quiet mode. Suppress the output of warning messages about non-mailbox
178 files, directories, etc.
179
180 -r
181 Generate a report of the names of the files containing emails
182 matching the expression, along with a count of the number of matching
183 emails.
184
185 -R
Causes grepmail to recurse into any directories encountered.
187
188 -s
189 Return emails which match the size (in bytes) specified with this
190 flag. Note that this size includes the length of the header.
191
A size constraint must take one of the following forms:
- 12345: match a size of exactly 12345
- <12345, <=12345, >12345, >=12345: match a size less than, less than
  or equal to, greater than, or greater than or equal to 12345
- 10000-12345: match a size between 10000 and 12345, inclusive
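
The three forms of size constraint listed above can be sketched as a small Python parser. This illustrates the documented syntax and is not grepmail's implementation; parse_size is a hypothetical name, and it returns a function that tests a message size in bytes.

```python
import re

def parse_size(spec):
    """Turn a size constraint into a predicate on a size in bytes."""
    # Range form: 10000-12345 (inclusive at both ends).
    m = re.fullmatch(r"(\d+)-(\d+)", spec)
    if m:
        low, high = int(m.group(1)), int(m.group(2))
        return lambda size: low <= size <= high
    # Exact or comparison form: 12345, <12345, <=12345, >12345, >=12345.
    m = re.fullmatch(r"(<=|>=|<|>)?(\d+)", spec)
    if not m:
        raise ValueError(f"bad size constraint: {spec!r}")
    op, n = m.group(1), int(m.group(2))
    tests = {
        None: lambda size: size == n,
        "<": lambda size: size < n,
        "<=": lambda size: size <= n,
        ">": lambda size: size > n,
        ">=": lambda size: size >= n,
    }
    return tests[op]

in_range = parse_size("2000-3000")
print(in_range(2500), in_range(3001))
```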
198
199 -S
200 Ignore signatures. The signature consists of everything after a line
201 consisting of "-- ".
202
203 -u
Output only unique emails, by analogy to sort -u. grepmail determines
email uniqueness by the Message-ID header.
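
Deduplication by Message-ID can be pictured with the Python sketch below. It is not grepmail's code. The name unique_emails is made up, and two choices are assumptions made here: the first copy of each message is the one kept, and messages without a Message-ID header are always kept.

```python
import re

MESSAGE_ID = re.compile(r"^Message-ID:[ \t]*(.+)$", re.MULTILINE | re.IGNORECASE)

def unique_emails(emails):
    """Keep the first email seen for each Message-ID value."""
    seen = set()
    kept = []
    for email in emails:
        header = email.partition("\n\n")[0]  # header ends at first blank line
        m = MESSAGE_ID.search(header)
        key = m.group(1).strip() if m else None
        if key is not None:
            if key in seen:
                continue  # duplicate: skip it
            seen.add(key)
        kept.append(email)
    return kept

a = "Message-ID: <1@example.com>\nSubject: hi\n\nfirst copy\n"
b = "Message-ID: <1@example.com>\nSubject: hi\n\nsecond copy\n"
print(len(unique_emails([a, b])))
```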
206
207 -v
208 Invert the sense of the search, by analogy to grep -v. This results
209 in the set of emails printed being the complement of those that would
210 be printed without the -v switch.
211
212 -V
213 Print the version and exit.
214
215 -w
216 Search for only those lines which contain the pattern as part of a
217 word group. That is, the start of the pattern must match the start
218 of a word, and the end of the pattern must match the end of a word.
219 (Note that the start and end need not be for the same word.)
220
221 If you are familiar with Perl regular expressions, this flag simply
222 puts a "\b" before and after the search pattern.
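
The effect of the surrounding "\b" anchors can be tried out in Python, whose regular expressions treat "\b" the same way for simple ASCII words. This is a sketch of the documented behavior only; word_pattern is a hypothetical helper name.

```python
import re

def word_pattern(pattern):
    """Wrap a pattern in word-boundary anchors, as -w is described to do."""
    return re.compile(r"\b" + pattern + r"\b")

# "car" matches as a whole word, but not inside "cars".
print(bool(word_pattern("car").search("my car, your car")))
print(bool(word_pattern("car").search("two cars")))
# The start and end may belong to different words.
print(bool(word_pattern("red car").search("a red car.")))
```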
223
224 -X
225 Specify a regular expression for the signature separator. By default
226 this pattern is '^-- $'.
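
How -S and -X work together can be sketched in Python: everything from the first line matching the separator pattern onward is dropped before searching. This is an illustration, not grepmail's implementation; strip_signature is a made-up name, and removing the separator line itself along with the signature is an assumption.

```python
import re

def strip_signature(body, separator=r"^-- $"):
    """Drop the separator line and everything after it.

    The default separator mirrors the documented default for -X.
    """
    return re.split(separator, body, maxsplit=1, flags=re.MULTILINE)[0]

body = "See you.\n-- \nAndy\nandy@example.com\n"
print(repr(strip_signature(body)))
```

Note that the default pattern requires the trailing space after the two dashes, so a line containing only "--" is not treated as a separator.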
227
228 -Y
229 Specify a pattern which indicates specific headers to be searched.
230 The search will automatically treat headers which span multiple lines
231 as one long line. This flag implies -h.
232
233 In the style of procmail, special strings in the pattern will be
234 expanded as follows:
235
236 If the regular expression contains "^TO:" it will be substituted by
237
238 ^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):
239
240 which should match all headers with destination addresses.
241
242 If the regular expression contains "^FROM_DAEMON:" it will be
243 substituted by
244
245 (^(Mailing-List:|Precedence:.*(junk|bulk|list)|To: Multiple recipients of |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*[^(.%@a-z0-9])?(Post(ma?(st(e?r)?|n)|office)|(send)?Mail(er)?|daemon|m(mdf|ajordomo)|n?uucp|LIST(SERV|proc)|NETSERV|o(wner|ps)|r(e(quest|sponse)|oot)|b(ounce|bs\.smtp)|echo|mirror|s(erv(ices?|er)|mtp(error)?|ystem)|A(dmin(istrator)?|MMGR|utoanswer))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t ][^<)]*(\(.*\).*)?)?
246
247 which should catch mails coming from most daemons.
248
249 If the regular expression contains "^FROM_MAILER:" it will be
250 substituted by
251
252 (^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From)([^>]*[^(.%@a-z0-9])?(Post(ma(st(er)?|n)|office)|(send)?Mail(er)?|daemon|mmdf|n?uucp|ops|r(esponse|oot)|(bbs\.)?smtp(error)?|s(erv(ices?|er)|ystem)|A(dmin(istrator)?|MMGR))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t][^<)]*(\(.*\).*)?)?$([^>]|$))
253
254 (a stripped down version of "^FROM_DAEMON:"), which should catch
255 mails coming from most mailer-daemons.
256
257 So, to search for all emails to or from "Andy":
258
259 grepmail -Y '(^TO:|^From:)' Andy mailbox
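
The -Y behavior described above (macro expansion, joining wrapped headers, then searching only the selected headers) can be sketched in Python. This is an illustration under stated assumptions, not grepmail's code: the names unfold and header_search are hypothetical, only the "^TO:" macro is handled, and matching is case-sensitive here, which may differ from grepmail.

```python
import re

# Expansion of "^TO:" as given in the text above.
TO_MACRO = (r"^((Original-)?(Resent-)?(To|Cc|Bcc)"
            r"|(X-Envelope|Apparently(-Resent)?)-To):")

def unfold(header):
    """Join continuation lines (starting with whitespace) to the previous line."""
    return re.sub(r"\n[ \t]+", " ", header)

def header_search(header, header_pattern, pattern):
    """True if some header selected by header_pattern contains pattern."""
    header_re = re.compile(header_pattern.replace("^TO:", TO_MACRO))
    text_re = re.compile(pattern)
    for line in unfold(header).split("\n"):
        if header_re.search(line) and text_re.search(line):
            return True
    return False

header = ("From: bob@example.com\n"
          "Cc: carol@example.com,\n"
          "\tAndy <andy@example.com>\n"
          "Subject: lunch")
print(header_search(header, "(^TO:|^From:)", "Andy"))
```

In this sketch the wrapped "Cc:" header is first joined into one line, so "Andy" on the continuation line is still found under the "^TO:" expansion.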
260
261 --help
262 Print a help message summarizing the usage.
263
264 --
265 All arguments following -- are treated as mail folders.
266
EXAMPLES
Count the number of emails. ("." matches every email.)
269
270 grepmail -r . sent-mail
271
Get all email between 2000 and 3000 bytes about books:

grepmail books -s 2000-3000 sent-mail

Get all email that you mailed yesterday:
277
278 grepmail -d yesterday sent-mail
279
Get all email that you mailed before the first Thursday in June 1998
that pertains to research (requires Date::Manip):
282
283 grepmail research -d "before 1st thursday in June 1998" sent-mail
284
285 Get all email that you mailed before the first of June 1998 that
286 pertains to research:
287
288 grepmail research -d "before 6/1/98" sent-mail
289
290 Get all email you received since 8/20/98 that wasn't about research or
291 your job, ignoring case:
292
293 grepmail -iv "(research|job)" -d "since 8/20/98" saved-mail
294
295 Get all email about mime but not about Netscape. Constrain the search
296 to match the body, since most headers contain the text "mime":
297
298 grepmail -b mime saved-mail | grepmail Netscape -v
299
300 Print a list of all mailboxes containing a message from Rodney.
301 Constrain the search to the headers, since quoted emails may match the
302 pattern:
303
304 grepmail -hl "^From.*Rodney" saved-mail*
305
306 Find all emails with the text "Pilot" in both the header and the body:
307
308 grepmail -hb "Pilot" saved-mail*
309
Print a count of the number of messages about grepmail in all
saved-mail mailboxes:
312
313 grepmail -br grepmail saved-mail*
314
315 Remove any duplicates from a mailbox:
316
317 grepmail -u saved-mail
318
319 Convert a Gnus mailbox to mbox format:
320
321 grepmail . gnus-mailbox-dir/* > mbox
322
323 Search for all emails to or from an address (taking into account
324 wrapped headers and different header names):
325
326 grepmail -Y '(^TO:|^From:)' my@email.address saved-mail
327
328 Find all emails from postmasters:
329
330 grepmail -Y '^FROM_MAILER:' . saved-mail
331
THEORY OF OPERATION
grepmail does not create temporary files while decompressing
compressed archives. The last version to do so was 3.5. While the
new design uses more memory, the code is much simpler, and there is
less chance that email can be read by malicious third parties. Memory
usage is determined by the size of the largest email message in the
mailbox.
338
ENVIRONMENT
The MAILDIR environment variable can be used to specify the default
mail directory. This directory will be searched if the specified
mailbox cannot be found directly.

The HOME environment variable is also used to find mailboxes if they
cannot be found directly. It is also used to store grepmail state
information such as its cache file.
347
BUGS AND LIMITATIONS
Patterns containing "$" may cause problems
350 Currently I look for "$" followed by a non-word character and replace
351 it with the line ending for the current file (either "\n" or "\r\n").
352 This may cause problems with complex patterns specified with -E, but
353 I'm not aware of any.
354
355 Mails without bodies cause problems
356 According to RFC 822, mail messages need not have message bodies.
357 I've found and removed one bug related to this. I'm not sure if there
358 are others.
359
360 Complex single-point dates not parsed correctly
361 If you specify a point date like "September 1, 2004", grepmail
362 creates a date range that includes the entire day of September 1,
363 2004. If you specify a complex point date such as "today", "1st
364 Monday in July", or "9/1/2004 at 0:00" grepmail may parse the time
365 incorrectly.
366
367 The reason for this problem is that Date::Manip, as of version 5.42,
368 forces default values for parsed dates and times. This means that
369 grepmail has a hard time determining whether the user supplied
370 certain time/date fields. (e.g. Did Date::Manip provide a default
371 time of 0:00, or did the user specify it?) grepmail tries to work
372 around this problem, but the workaround is inherently incomplete in
373 some rare cases.
374
File names that look like flags cause problems
376 In some special circumstances, grepmail will be confused by files
377 whose names look like flags. In such cases, use the -e flag to
378 specify the search pattern.
379
LICENSE
This code is distributed under the GNU General Public License (GPL)
Version 2. See the file LICENSE in the distribution for details.
383
AUTHOR
David Coppit <david@coppit.org>
386
SEE ALSO
elm(1), mail(1), grep(1), perl(1), printmail(1), Mail::Internet(3),
procmailrc(5). Crocker, D. H., Standard for the Format of ARPA
Internet Text Messages, RFC 822.
391
perl v5.32.1 2021-01-26 GREPMAIL(1)