GREPMAIL(1) User Contributed Perl Documentation GREPMAIL(1)

NAME
grepmail - search mailboxes for mail matching a regular expression

SYNOPSIS
grepmail [--help|--version] [-abBDFhHilLmrRuvVw] [-C <cache-file>]
    [-j <status>] [-s <sizespec>] [-d <date-specification>]
    [-X <signature-pattern>] [-Y <header-pattern>]
    [[-e] <pattern>|-E <expr>|-f <pattern-file>] <files...>

DESCRIPTION
grepmail looks for mail messages containing a pattern, and prints the
resulting messages on standard output.

By default grepmail looks in both header and body for the specified
pattern.

When redirected to a file, the result is another mailbox, which can,
in turn, be handled by standard mail user agents such as elm, or even
used as input for another instance of grepmail.

At least one of -E, -e, -d, -s, or -u must be specified. The pattern
is optional if -d, -s, and/or -u is used. The -e flag is optional if
there is no file whose name is the pattern. The -E option can be used
to specify complex search expressions involving logical operators
(see -E below).

If a mailbox cannot be found at the given path, grepmail first
searches the directory specified by the MAILDIR environment variable
(if one is defined), then searches the $HOME/mail, $HOME/Mail, and
$HOME/Mailbox directories.
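The lookup order described above can be sketched in Python. This is only an illustration of the order of the search, not grepmail's actual code: the function name candidate_mailbox_paths is hypothetical, and trying the name exactly as given before any fallback directory is an assumption drawn from the wording "if a mailbox can not be found".

```python
import os
import posixpath


def candidate_mailbox_paths(name, environ=None):
    """Return the paths to try for mailbox `name`, in lookup order."""
    environ = os.environ if environ is None else environ
    # Assumption: the name as typed on the command line is tried first.
    candidates = [name]
    maildir = environ.get("MAILDIR")
    if maildir:  # only consulted when MAILDIR is defined
        candidates.append(posixpath.join(maildir, name))
    home = environ.get("HOME")
    if home:
        for subdir in ("mail", "Mail", "Mailbox"):
            candidates.append(posixpath.join(home, subdir, name))
    return candidates
```

A caller would use the first candidate that exists on disk.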

OPTIONS AND ARGUMENTS
Many of the options and arguments are analogous to those of grep.

pattern
    The pattern to search for in the mail message. May be any Perl
    regular expression, but should be quoted on the command line to
    protect against globbing (shell expansion). To search for more than
    one pattern, use the form "(pattern1|pattern2|...)".

    Note that complex pattern features such as "(?>...)" require that you
    use a version of perl which supports them. You can use the pattern
    "()" to indicate that you do not want to match anything. This is
    useful if you want to initialize the cache without printing any
    output.

mailbox
    Mailboxes must be in the traditional UNIX "/bin/mail" mailbox
    format. The mailboxes may be compressed with gzip or bzip2, in which
    case gunzip or bzip2 must be installed on the system.

    If no mailbox is specified, grepmail takes input from stdin, which
    may be compressed or uncompressed. grepmail's behavior is undefined
    when ASCII and binary data are piped together as input.

-a
    Use arrival date instead of sent date.

-b
    Asserts that the pattern must match in the body of the email.

-B
    Print the body but with only minimal ('From ', 'From:', 'Subject:',
    'Date:') headers. This flag can be used with -H, in which case it
    will print only short headers and no email bodies.

-C
    Specifies the location of the cache file. The default is
    $HOME/.grepmail-cache.

-D
    Enable debug mode, which prints diagnostic messages.

-d
    Date specifications must take one of the following forms:
    - a date like "today", "yesterday", "5/18/93", "5 days ago", or
      "5 weeks ago",
    - OR "before", "after", or "since", followed by a date as defined
      above,
    - OR "between <date> and <date>", where <date> is defined as above.

    Simple date expressions will first be parsed by Date::Parse. If this
    fails, grepmail will attempt to parse the date with Date::Manip, if
    the module is installed on the system. Use an empty date
    specification (i.e. -d "") to find emails without a "Date: ..." line
    in the header.

    Date specifications without times are interpreted as having a time of
    midnight at the start of that day, except for "after" and "since"
    specifications, which are interpreted as midnight of the following
    day. For example, "between today and tomorrow" is the same as simply
    "today", and returns emails dated on the current day. ("now" is
    interpreted as "today".) The date specification "after July 5th"
    will return emails whose date is midnight July 6th or later.
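The midnight rules above can be illustrated with a small Python sketch. It models only the semantics stated in this section, not grepmail's date parsing: the function names are hypothetical and the year 2004 is an arbitrary choice for the example.

```python
from datetime import date, datetime, time, timedelta


def day_start(day):
    """Midnight at the start of `day`, used for dates given without a time."""
    return datetime.combine(day, time.min)


def after_cutoff(day):
    """Earliest timestamp accepted by "after <day>" or "since <day>"."""
    return day_start(day + timedelta(days=1))


def between_bounds(first, second):
    """Bounds of "between <first> and <second>": midnight of each day."""
    return day_start(first), day_start(second)


# "after July 5th" accepts mail dated midnight July 6th or later.
print(after_cutoff(date(2004, 7, 5)))  # 2004-07-06 00:00:00
```

Under these rules "between today and tomorrow" spans from midnight today to midnight tomorrow, which is why it selects the same mail as "today".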

-E
    Specify a complex search expression using logical operators. The
    current syntax allows the user to specify search expressions using
    Perl syntax. Three values can be used: $email (the entire email
    message), $email_header (just the header), or $email_body (just the
    body). A search is specified in the form "$email =~ /pattern/", and
    multiple searches can be combined using "&&" and "||" for "and" and
    "or".

    For example, the expression

        $email_header =~ /^From: .*\@coppit.org/ && $email =~ /grepmail/i

    will find all emails which originate from coppit.org (you must escape
    the "@" sign with a backslash), and which contain the keyword
    "grepmail" anywhere in the message, in any capitalization.
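For readers more used to Python than Perl, the example expression corresponds roughly to the sketch below. It is an analogy rather than a description of how grepmail evaluates -E: the helper names are hypothetical, the header is assumed to end at the first blank line, and "^" is assumed to match at the start of any header line (hence re.MULTILINE).

```python
import re


def split_email(email):
    """Split a message into header and body at the first blank line."""
    header, _, body = email.partition("\n\n")
    return header, body


def matches(email):
    """Python analogue of the two-part -E expression shown above."""
    header, _body = split_email(email)
    # No backslash is needed before "@" in a Python pattern.
    from_coppit = re.search(r"^From: .*@coppit.org", header, re.MULTILINE)
    mentions_grepmail = re.search(r"grepmail", email, re.IGNORECASE)
    return bool(from_coppit and mentions_grepmail)
```

Both conditions must hold, mirroring the "&&" in the Perl expression. Replacing "and" with "or" gives the "||" form.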

    -E is incompatible with -b, -h, and -e. Support for -i, -M, -S, and
    -Y in combination with -E has not yet been implemented.

    NOTE: The syntax of search expressions may change in the future. In
    particular, support for size, date, and other constraints may be
    added. The syntax may also be simplified to make expressions easier
    to write, perhaps at the expense of reduced functionality.

-e
    Explicitly specify the search pattern. This is useful for specifying
    patterns that begin with "-", which would otherwise be interpreted as
    a flag.

-f
    Obtain patterns from the given pattern file, one per line. An empty
    file contains zero patterns, and therefore matches nothing.

-F
    Force grepmail to process all files and streams as though they were
    mailboxes. (That is, skip checks for non-mailbox ASCII files or
    binary files that don't look like they are compressed using known
    schemes.)

-h
    Asserts that the pattern must match in the header of the email.

-H
    Print the header but not body of matching emails.

-i
    Make the search case-insensitive (by analogy to grep -i).

-j
    Asserts that the email "Status:" header must contain the given flags.
    Order and case are not important, so use -j AR or -j ra to search for
    emails which have been read and answered.
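Because order and case do not matter, the -j test amounts to a case-insensitive subset check. A minimal sketch, assuming the value of the "Status:" header has already been extracted (the function name is hypothetical):

```python
def status_matches(status_value, wanted_flags):
    """True if every wanted flag appears in the Status: header value."""
    present = set(status_value.upper())
    return set(wanted_flags.upper()) <= present


# "-j AR" and "-j ra" select the same messages.
print(status_matches("RA", "AR"), status_matches("RA", "ra"))  # True True
```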

-l
    Output the names of files having an email matching the expression
    (by analogy to grep -l).

-L
    Follow symbolic links. (Implies -R)

-M
    Causes grepmail to ignore non-text MIME attachments. This removes
    false positives resulting from binaries encoded as ASCII attachments.

-m
    Append "X-Mailfolder: <folder>" to all email headers, indicating
    which folder contained the matched email.

-n
    Prefix each line with line number information. If multiple files are
    specified, the filename will precede the line number. NOTE: When used
    in conjunction with -m, the X-Mailfolder header has the same line
    number as the next (blank) line.

-q
    Quiet mode. Suppress the output of warning messages about non-mailbox
    files, directories, etc.

-r
    Generate a report of the names of the files containing emails
    matching the expression, along with a count of the number of matching
    emails.

-R
    Causes grepmail to recurse any directories encountered.

-s
    Return emails which match the size (in bytes) specified with this
    flag. Note that this size includes the length of the header.

    Size constraints must take one of the following forms:
    - 12345: match size of exactly 12345
    - <12345, <=12345, >12345, >=12345: match size less than, less than
      or equal, greater than, or greater than or equal to 12345
    - 10000-12345: match size between 10000 and 12345 inclusive
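The three forms of size constraint can be read as a small grammar. The Python sketch below turns a constraint into a predicate on a message size in bytes. It is an illustration of the forms listed above, not grepmail's parser, and the name size_predicate is hypothetical.

```python
import operator
import re

_COMPARATORS = {
    None: operator.eq,  # a bare number means "exactly"
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def size_predicate(spec):
    """Return a function that tests a size (in bytes) against `spec`."""
    range_match = re.fullmatch(r"(\d+)-(\d+)", spec)
    if range_match:
        low, high = (int(group) for group in range_match.groups())
        return lambda size: low <= size <= high  # inclusive range
    single = re.fullmatch(r"(<=|>=|<|>)?(\d+)", spec)
    if single is None:
        raise ValueError(f"unrecognized size constraint: {spec!r}")
    compare = _COMPARATORS[single.group(1)]
    limit = int(single.group(2))
    return lambda size: compare(size, limit)
```

For example, size_predicate("2000-3000") accepts 2000 and 3000 but rejects 3001. On the command line the "<" and ">" forms need quoting so that the shell does not treat them as redirections.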

-S
    Ignore signatures. The signature consists of everything after a line
    consisting of "-- ".

-u
    Output only unique emails, by analogy to sort -u. Grepmail determines
    email uniqueness by the Message-ID header.
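Deduplicating on Message-ID can be sketched with Python's standard email package. This illustrates the idea only: the function name is hypothetical, and keeping every message that has no Message-ID header is an assumption, since this page does not say how such messages are treated.

```python
from email import message_from_string


def unique_messages(raw_messages):
    """Keep the first message seen for each Message-ID, in input order."""
    seen = set()
    unique = []
    for raw in raw_messages:
        message_id = message_from_string(raw)["Message-ID"]
        if message_id is not None:
            if message_id in seen:
                continue  # duplicate of an earlier message
            seen.add(message_id)
        # Assumption: messages without a Message-ID are always kept.
        unique.append(raw)
    return unique
```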

-v
    Invert the sense of the search, by analogy to grep -v. This results
    in the set of emails printed being the complement of those that would
    be printed without the -v switch.

-V
    Print the version and exit.

-w
    Search for only those lines which contain the pattern as part of a
    word group. That is, the start of the pattern must match the start
    of a word, and the end of the pattern must match the end of a word.
    (Note that the start and end need not be for the same word.)

    If you are familiar with Perl regular expressions, this flag simply
    puts a "\b" before and after the search pattern.
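The effect of wrapping a pattern in "\b" can be tried out in Python, whose "\b" has the same word-boundary meaning as Perl's. The helper name word_pattern is hypothetical. The sketch adds the anchors literally, as described above, without grouping the pattern first.

```python
import re


def word_pattern(pattern):
    """Wrap `pattern` in word-boundary anchors, as -w is described to do."""
    return re.compile(r"\b" + pattern + r"\b")


print(bool(word_pattern("mail").search("send mail now")))   # True
print(bool(word_pattern("mail").search("grepmail rocks")))  # False
```

The pattern "mail.*list" wrapped this way matches "mail to the list": the match starts at the beginning of one word and ends at the end of another, which is the case the parenthetical note above refers to.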

-X
    Specify a regular expression for the signature separator. By default
    this pattern is '^-- $'.
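How a separator pattern such as '^-- $' splits a body from its signature can be sketched as follows. This is an illustration only: the function name is hypothetical, the pattern is assumed to be applied line by line (re.MULTILINE), and dropping the separator line together with the signature is an assumption.

```python
import re


def strip_signature(body, separator=r"^-- $"):
    """Return `body` up to, but not including, the first separator line."""
    match = re.search(separator, body, re.MULTILINE)
    return body if match is None else body[: match.start()]


print(repr(strip_signature("See you there.\n-- \nAndy\n")))  # 'See you there.\n'
```

Note that the default separator requires the trailing space: a line containing only "--" is not treated as the start of a signature.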

-Y
    Specify a pattern which indicates specific headers to be searched.
    The search will automatically treat headers which span multiple lines
    as one long line. This flag implies -h.

    In the style of procmail, special strings in the pattern will be
    expanded as follows:

    If the regular expression contains "^TO:" it will be substituted by

        ^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):

    which should match all headers with destination addresses.
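The two steps described so far, joining wrapped header lines and expanding "^TO:", can be sketched in Python. The expansion string is copied from above. Everything else is an assumption made for illustration: the function names are hypothetical, and a continuation line is taken to be a line that starts with a space or tab.

```python
import re

TO_EXPANSION = (
    r"^((Original-)?(Resent-)?(To|Cc|Bcc)"
    r"|(X-Envelope|Apparently(-Resent)?)-To):"
)


def unfold(header):
    """Join continuation lines so each header field is one long line."""
    return re.sub(r"\r?\n[ \t]+", " ", header)


def expand(header_pattern):
    """Replace the special string "^TO:" with its expansion."""
    return header_pattern.replace("^TO:", TO_EXPANSION)


def selected_headers(header, header_pattern):
    """Return the unfolded header lines that `header_pattern` selects."""
    regex = re.compile(expand(header_pattern))
    return [line for line in unfold(header).split("\n") if regex.search(line)]
```

With the pattern '(^TO:|^From:)', a wrapped "To:" field is returned as a single line, and a "Cc:" or "Resent-To:" field is selected as well, while a "Subject:" line that merely contains the text "To:" is not.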

    If the regular expression contains "^FROM_DAEMON:" it will be
    substituted by

        (^(Mailing-List:|Precedence:.*(junk|bulk|list)|To: Multiple recipients of |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*[^(.%@a-z0-9])?(Post(ma?(st(e?r)?|n)|office)|(send)?Mail(er)?|daemon|m(mdf|ajordomo)|n?uucp|LIST(SERV|proc)|NETSERV|o(wner|ps)|r(e(quest|sponse)|oot)|b(ounce|bs\.smtp)|echo|mirror|s(erv(ices?|er)|mtp(error)?|ystem)|A(dmin(istrator)?|MMGR|utoanswer))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t ][^<)]*(\(.*\).*)?)?

    which should catch mails coming from most daemons.

    If the regular expression contains "^FROM_MAILER:" it will be
    substituted by

        (^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From)([^>]*[^(.%@a-z0-9])?(Post(ma(st(er)?|n)|office)|(send)?Mail(er)?|daemon|mmdf|n?uucp|ops|r(esponse|oot)|(bbs\.)?smtp(error)?|s(erv(ices?|er)|ystem)|A(dmin(istrator)?|MMGR))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t][^<)]*(\(.*\).*)?)?$([^>]|$))

    (a stripped down version of "^FROM_DAEMON:"), which should catch
    mails coming from most mailer-daemons.

    So, to search for all emails to or from "Andy":

        grepmail -Y '(^TO:|^From:)' Andy mailbox

--help
    Print a help message summarizing the usage.

--
    All arguments following -- are treated as mail folders.

EXAMPLES
Count the number of emails ("." matches every email):

    grepmail -r . sent-mail

Get all email between 2000 and 3000 bytes about books:

    grepmail books -s 2000-3000 sent-mail

Get all email that you mailed yesterday:

    grepmail -d yesterday sent-mail

Get all email that you mailed before the first Thursday in June 1998
that pertains to research (requires Date::Manip):

    grepmail research -d "before 1st thursday in June 1998" sent-mail

Get all email that you mailed before the first of June 1998 that
pertains to research:

    grepmail research -d "before 6/1/98" sent-mail

Get all email you received since 8/20/98 that wasn't about research or
your job, ignoring case:

    grepmail -iv "(research|job)" -d "since 8/20/98" saved-mail

Get all email about mime but not about Netscape. Constrain the search
to match the body, since most headers contain the text "mime":

    grepmail -b mime saved-mail | grepmail Netscape -v

Print a list of all mailboxes containing a message from Rodney.
Constrain the search to the headers, since quoted emails may match the
pattern:

    grepmail -hl "^From.*Rodney" saved-mail*

Find all emails with the text "Pilot" in both the header and the body:

    grepmail -hb "Pilot" saved-mail*

Print a count of the number of messages about grepmail in all
saved-mail mailboxes:

    grepmail -br grepmail saved-mail*

Remove any duplicates from a mailbox:

    grepmail -u saved-mail

Convert a Gnus mailbox to mbox format:

    grepmail . gnus-mailbox-dir/* > mbox

Search for all emails to or from an address (taking into account
wrapped headers and different header names):

    grepmail -Y '(^TO:|^From:)' my@email.address saved-mail

Find all emails from postmasters:

    grepmail -Y '^FROM_MAILER:' . saved-mail

FILES
grepmail will not create temporary files while decompressing compressed
archives. The last version to do this was 3.5. While the new design
uses more memory, the code is much simpler, and there is less chance
that email can be read by malicious third parties. Memory usage is
determined by the size of the largest email message in the mailbox.

ENVIRONMENT
The MAILDIR environment variable can be used to specify the default
mail directory. This directory will be searched if the specified
mailbox cannot be found directly.

The HOME environment variable is also used to find mailboxes if they
cannot be found directly. It is also used to store grepmail state
information such as its cache file.

BUGS AND LIMITATIONS
Patterns containing "$" may cause problems
    Currently I look for "$" followed by a non-word character and replace
    it with the line ending for the current file (either "\n" or "\r\n").
    This may cause problems with complex patterns specified with -E, but
    I'm not aware of any.
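The rewrite described above can be modelled in a few lines of Python. This is a sketch of the stated rule only (a "$" followed by a non-word character), not grepmail's code, and the function name is hypothetical.

```python
import re


def localize_dollar(pattern, line_ending):
    """Replace "$" before a non-word character with the file's line ending."""
    # Spell the line ending in regex syntax, e.g. a backslash followed by "n".
    escaped = line_ending.replace("\r", r"\r").replace("\n", r"\n")
    # A function is used as the replacement so the backslashes are kept as-is.
    return re.sub(r"\$(?=\W)", lambda match: escaped, pattern)


print(localize_dollar(r"foo$|bar", "\r\n"))  # foo\r\n|bar
```

A "$" that is followed by a word character, or by nothing at all, is left untouched under this rule. An escaped dollar sign such as "\$ " is rewritten as well, which is one way a complex pattern could be changed unexpectedly.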

Mails without bodies cause problems
    According to RFC 822, mail messages need not have message bodies.
    I've found and removed one bug related to this. I'm not sure if there
    are others.

Complex single-point dates not parsed correctly
    If you specify a point date like "September 1, 2004", grepmail
    creates a date range that includes the entire day of September 1,
    2004. If you specify a complex point date such as "today", "1st
    Monday in July", or "9/1/2004 at 0:00" grepmail may parse the time
    incorrectly.

    The reason for this problem is that Date::Manip, as of version 5.42,
    forces default values for parsed dates and times. This means that
    grepmail has a hard time determining whether the user supplied
    certain time/date fields. (For example, did Date::Manip provide a
    default time of 0:00, or did the user specify it?) grepmail tries to
    work around this problem, but the workaround is inherently incomplete
    in some rare cases.

File names that look like flags cause problems
    In some special circumstances, grepmail will be confused by files
    whose names look like flags. In such cases, use the -e flag to
    specify the search pattern.

LICENSE
This code is distributed under the GNU General Public License (GPL).
See the file LICENSE in the distribution,
http://www.opensource.org/gpl-license.html, and
http://www.opensource.org/.

AUTHOR
David Coppit <david@coppit.org>

SEE ALSO
elm(1), mail(1), grep(1), perl(1), printmail(1), Mail::Internet(3),
procmailrc(5). Crocker, D. H., Standard for the Format of ARPA
Internet Text Messages, RFC 822.

perl v5.12.0 2010-06-01 GREPMAIL(1)