MAIRIX(1)                   General Commands Manual                  MAIRIX(1)

NAME
       mairix - index and search mail folders

SYNOPSIS
   Indexing
       mairix [ -v|--verbose ] [ -p|--purge ] [ -f|--rcfile mairixrc ]
              [ -F|--fast-index ]

   Searching
       mairix [ -v|--verbose ] [ -f|--rcfile mairixrc ] [ -r|--raw-output ]
              [ -x|--excerpt-output ] [ -o|--mfolder mfolder ]
              [ -a|--augment ] [ -t|--threads ] search-patterns

   Other
       mairix [ -h|--help ]

       mairix [ -V|--version ]

       mairix [ -d|--dump ]

DESCRIPTION
       mairix indexes and searches a collection of email messages.  The
       folders containing the messages for indexing are defined in the
       configuration file.  The indexing stage produces a database file.
       The database file provides rapid access to details of the indexed
       messages during searching operations.  A search normally produces a
       folder (a so-called mfolder) containing the matched messages.
       However, a raw mode (-r) exists which just lists the matched messages
       instead.

       mairix can operate on the following folder types:

       *      maildir

       *      MH (compatible with the MH folder formats used by xmh,
              sylpheed, claws-mail, nnml (Gnus) and evolution)

       *      mbox (including mboxes that have been compressed with gzip or
              bzip2)

       If maildir or MH source folders are used, and a search outputs its
       matches to an mfolder in maildir or MH format, symbolic links are
       used to reference the original messages inside the mfolder.  However,
       if mbox folders are involved, copies of messages are made instead.

OPTIONS
       mairix decides whether indexing or searching is required by looking
       for the presence of any search-patterns on the command line.

   Special modes
       -h, --help
              Show usage summary and exit.

       -V, --version
              Show program version and exit.

       -d, --dump
              Dump the database's contents in human-readable form to stdout.

   General options
       -f mairixrc
       --rcfile mairixrc
              Specify an alternative configuration file to use.  The default
              configuration file is ~/.mairixrc.

       -v, --verbose
              Make the output more verbose.

       -Q, --no-integrity-checks
              Normally mairix will do some internal integrity tests on the
              database.  The -Q option removes these checks, making mairix
              run faster, but it will be less likely to detect internal
              problems if any bugs creep in.

              The nochecks directive in the rc file has the same effect.

       --unlock
              mairix locks its database file during any indexing or
              searching operation to prevent multiple indexing runs
              interfering with each other, or an indexing run interfering
              with search runs.  The --unlock option removes the lockfile
              before doing the requested indexing or searching operation.
              This is a convenient way of cleaning up a stale lockfile if an
              earlier run crashed for some reason or was aborted.

   Indexing options
       -p, --purge
              Cause stale (dead) messages to be purged from the database
              during an indexing run.  (Normally, stale messages are left in
              the database because of the additional cost of compacting away
              the storage that they take up.)

       -F, --fast-index
              When processing maildir and MH folders, mairix normally
              compares the mtime and size of each message against the values
              stored in the database.  If they have changed, the message
              will be rescanned.  This check requires each message file to
              be stat'ed.  For large numbers of messages in these folder
              types, this can be a sizeable overhead.

              This option tells mairix that when a message currently on disc
              has a name matching one already in the database, it should
              assume the message is unchanged.

              A later indexing run without this option will fix up any
              rescans that were missed due to its use.

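       The difference between the normal check and -F can be sketched as
       follows.  This is an illustration in Python, not mairix's own code:
       the db dictionary (message path mapped to an (mtime, size) pair) and
       the function name needs_rescan are hypothetical stand-ins for
       whatever the real database records.

```python
import os

def needs_rescan(path, db, fast_index=False):
    """Decide whether the message file at `path` must be (re)scanned.

    `db` is a hypothetical stand-in for the database: it maps a message
    path to the (mtime, size) pair recorded when it was last indexed.
    """
    if path not in db:
        return True              # unknown name: always scan it
    if fast_index:
        return False             # -F: a matching name is assumed unchanged
    st = os.stat(path)           # the per-message stat() that -F avoids
    return (st.st_mtime, st.st_size) != db[path]
```

       With fast_index set, a message that was modified in place keeps its
       old index entries until a later run without the option stats it
       again.
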
   Search options
       -a, --augment
              Append newly matched messages to the current mfolder instead
              of creating the mfolder from scratch.

       -t, --threads
              As well as returning the matched messages, also return every
              message in the same thread as one of the real matches.

       -r, --raw-output
              Instead of creating an mfolder containing the matched
              messages, just show their paths on stdout.

       -x, --excerpt-output
              Instead of creating an mfolder containing the matched
              messages, display an excerpt from their headers on stdout.
              The excerpt shows To, Cc, From, Subject and Date.

       -o mfolder
       --mfolder mfolder
              Specify a temporary alternative path for the mfolder to use,
              overriding the mfolder directive in the rc file.

              mairix will refuse to output search results into any folder
              that appears to be amongst those that are indexed.  This is to
              prevent accidental deletion of emails.

   Search patterns
       t:word
              Match word in the To: header.

       c:word
              Match word in the Cc: header.

       f:word
              Match word in the From: header.

       s:word
              Match word in the Subject: header.

       m:word
              Match word in the Message-ID: header.

       b:word
              Match word in the message body.

              Message body is taken to mean any body part of type text/plain
              or text/html.  For text/html, text within meta tags is
              ignored.  In particular, the URLs inside <A HREF="..."> tags
              are not currently indexed.  Non-text attachments are ignored.
              If there's an attachment of type message/rfc822, this is
              parsed and the match is performed on this sub-message too.  If
              a hit occurs, the enclosing message is treated as having a
              hit.

       d:[start-datespec]-[end-datespec]
              Match messages with Date: headers lying in the specified
              range.

       z:[low-size]-[high-size]
              Match messages whose size lies in the specified range.  If the
              low-size argument is omitted it defaults to zero.  If the
              high-size argument is omitted it defaults to infinite size.

              For example, to match messages between 10 kilobytes and 20
              kilobytes in size, the following search term can be used:

                   mairix z:10k-20k

              The suffix 'k' on a number means multiply by 1024, and the
              suffix 'M' on a number means multiply by 1024*1024.

       n:word
              Match word occurring as the name of an attachment in the
              message.  Since attachment names are usually long, this option
              would usually be used in the substring form.  So

                   mairix n:mairix=

              would match all messages which have attachments whose names
              contain the substring mairix.

              The attachment name is determined from the name=xxx or
              filename=xxx qualifiers on the Content-Type: and
              Content-Disposition: headers respectively.

       F:flags
              Match messages with particular flag settings.  The available
              flags are 's' meaning seen, 'r' meaning replied, and 'f'
              meaning flagged.  The flags are case-insensitive.  A flag
              letter may be prefixed by a '-' to negate its sense.  Thus

                   mairix F:-s d:1w-

              would match any unread message less than a week old, and

                   mairix F:f-r d:-1m

              would match any flagged message older than a month which you
              haven't replied to yet.

              Note that the flag characters and their meanings agree with
              those used as the suffix letters on message filenames in
              maildir folders.

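       The arguments to z: and F: described above can be read as in the
       following Python sketch.  It is only an illustration of the rules as
       stated here, not mairix's parser.  All function names are
       hypothetical, and two points are assumptions because the text does
       not settle them: a number without a suffix is taken to be a byte
       count, and only the exact suffixes 'k' and 'M' are accepted.

```python
def parse_size(text):
    """Convert '10k' or '2M' to bytes (assumption: bare numbers are bytes)."""
    multipliers = {"k": 1024, "M": 1024 * 1024}
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)

def parse_size_range(arg):
    """Split 'low-high'; a missing low is 0, a missing high is unbounded."""
    low, _, high = arg.partition("-")
    return (parse_size(low) if low else 0,
            parse_size(high) if high else float("inf"))

def parse_flags(arg):
    """Return (required, forbidden) sets of lower-case flag letters."""
    required, forbidden = set(), set()
    negate = False
    for ch in arg.lower():
        if ch == "-":
            negate = True          # applies to the next flag letter only
            continue
        if ch not in "srf":
            raise ValueError("unknown flag: " + ch)
        (forbidden if negate else required).add(ch)
        negate = False
    return required, forbidden

def flags_match(message_flags, arg):
    """True if a message's flag letters satisfy the F: argument."""
    required, forbidden = parse_flags(arg)
    have = set(message_flags.lower())
    return required <= have and not (forbidden & have)
```

       For instance, parse_size_range("10k-20k") gives (10240, 20480), and
       flags_match("FS", "f-r") is true for a message that is flagged and
       seen but not replied to.
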
   Searching for a match amongst more than one part of a message
       Multiple message parts may be grouped together if a match in any of
       them is sought.  Common examples follow.

       tc:word
              Match word in either the To: or Cc: headers (or both).

       bs:word
              Match word in either the Subject: header or the message body
              (or both).

       The a: search pattern is an abbreviation for tcf:; i.e. match the
       word in the To:, Cc: or From: headers.  ("a" stands for "address" in
       this case.)

   Match words
       The word argument to the search strings can take various forms.

       ~word
              Match messages not containing the word.

       word1,word2
              This matches if both the words are matched in the specified
              message part.

       word1/word2
              This matches if either of the words is matched in the
              specified message part.

       substring=
              Match any word containing substring as a substring.

       substring=N
              Match any word containing substring, allowing up to N errors
              in the match.  For example, if N is 1, a single error is
              allowed, where an error can be

              *      a missing letter

              *      an extra letter

              *      a different letter.

       ^substring=
              Match any word containing substring as a substring, with the
              requirement that substring occurs at the beginning of the
              matched word.

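       The match-word forms above can be modelled with a short Python
       sketch.  It is an illustration of the described behaviour only: the
       function names are hypothetical, the error count is computed with a
       standard approximate-substring (edit distance) recurrence that is not
       claimed to be what mairix uses internally, and the combination of '^'
       with an error count is left out because it is not described above.

```python
def substring_errors(pattern, word):
    """Smallest number of single-letter errors (missing, extra or
    different letter) needed for `pattern` to occur somewhere in `word`."""
    # prev[j]: cost of matching the pattern prefix handled so far against
    # some stretch of `word` ending at position j.  The first row is all
    # zeros because a match may begin anywhere in the word.
    prev = [0] * (len(word) + 1)
    for i, p in enumerate(pattern, start=1):
        cur = [i] + [0] * len(word)
        for j, w in enumerate(word, start=1):
            cur[j] = min(prev[j - 1] + (p != w),  # same or different letter
                         prev[j] + 1,             # pattern letter missing from word
                         cur[j - 1] + 1)          # extra letter in word
        prev = cur
    return min(prev)                              # a match may end anywhere

def matches(spec, words):
    """True if the set of indexed `words` satisfies the match word `spec`."""
    if spec.startswith("~"):
        return not matches(spec[1:], words)       # no word may match
    if "=" not in spec:
        return spec in words                      # plain word: exact match
    sub, _, count = spec.partition("=")
    if sub.startswith("^"):
        if count:
            raise ValueError("'^' with an error count is not modelled here")
        return any(w.startswith(sub[1:]) for w in words)
    allowed = int(count) if count else 0
    return any(substring_errors(sub, w) <= allowed for w in words)
```

       Under this model, matches("chrony=1", {"crony"}) is true because a
       single missing letter is within the allowed error count, while
       matches("chron", {"chrony"}) is false because a plain word must match
       exactly.
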
   Precedence matters
       The binding order of the constructions is:

       1.     Individual command line arguments define separate conditions
              which are AND-ed together.

       2.     Within a single argument, the letters before the colon define
              which message parts the expression applies to.  If there is no
              colon, the expression applies to all the headers listed
              earlier and the body.

       3.     After the colon, slashes delineate separate disjuncts, which
              are OR-ed together.

       4.     Each disjunct may contain separate conjuncts, which are
              separated by commas.  These conditions are AND-ed together.

       5.     Each conjunct may start with a tilde to negate it, and may be
              followed by an equals sign to indicate a substring match,
              optionally followed by an integer to define the maximum
              number of errors allowed.

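       The binding order can be made concrete with a small Python sketch
       that splits one word-search argument into its levels.  It uses the
       separators from the Match words list (slash for "either", comma for
       "both", a trailing equals sign for a substring match).  The function
       name and the dictionary layout are hypothetical, a leading '^' is
       simply left attached to the word, and the d:, z: and F: patterns,
       which take their own argument syntax, are not handled.

```python
def parse_pattern(arg):
    """Break one word-search argument into (parts, disjuncts).

    `parts` is the string of letters before the colon, or None when there
    is no colon.  `disjuncts` is a list (OR) of lists (AND) of conjuncts,
    each a dict with the keys 'word', 'negate', 'substring' and 'errors'.
    """
    parts, colon, expr = arg.partition(":")
    if not colon:                        # no colon: whole argument is the expression
        parts, expr = None, arg
    disjuncts = []
    for alternative in expr.split("/"):          # OR level
        conjuncts = []
        for term in alternative.split(","):      # AND level
            negate = term.startswith("~")
            if negate:
                term = term[1:]
            word, equals, count = term.partition("=")
            conjuncts.append({
                "word": word,
                "negate": negate,
                "substring": bool(equals),
                "errors": int(count) if count else 0,
            })
        disjuncts.append(conjuncts)
    return parts, disjuncts
```

       For example, parse_pattern("tc:alice,~bob/carol") yields the parts
       "tc" and two disjuncts: one requiring alice and the absence of bob,
       the other requiring carol.
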
   Date specification
       This section describes the syntax used for specifying dates when
       searching using the 'd:' option.

       Dates are specified as a range.  The start and end of the range can
       both be specified.  Alternatively, if the start is omitted, it is
       treated as being the beginning of time.  If the end is omitted, it is
       treated as the current time.

       There are 4 basic formats:

       d:start-end
              Specify both start and end explicitly.

       d:start-
              Specify start; end is the current time.

       d:-end Specify end; start is 'a long time ago' (i.e. early enough to
              include any message).

       d:period
              Specify start and end implicitly, as the start and end of the
              period given.

       The start and end can each be given as either an absolute or a
       relative date.  A relative endpoint is given as a number followed by
       a single letter defining the scaling:

       ┌────────┬─────────────┬───────────┬───────────────────────┐
       │letter  │ short for   │ example   │ meaning               │
       ├────────┼─────────────┼───────────┼───────────────────────┤
       │d       │ days        │ 3d        │ 3 days                │
       │w       │ weeks       │ 2w        │ 2 weeks (14 days)     │
       │m       │ months      │ 5m        │ 5 months (150 days)   │
       │y       │ years       │ 4y        │ 4 years (4*365 days)  │
       └────────┴─────────────┴───────────┴───────────────────────┘

       Months are always treated as 30 days, and years as 365 days, for this
       purpose.

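       The scaling in the table amounts to simple day arithmetic, shown here
       as a Python sketch.  The names DAYS_PER_UNIT, relative_to_days and
       relative_endpoint are hypothetical, and the sketch assumes that a
       relative endpoint counts back from the current date.

```python
from datetime import date, timedelta

# Fixed scalings from the table: months are 30 days, years are 365 days.
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}

def relative_to_days(spec):
    """Convert a relative endpoint such as '6w' into a number of days."""
    number, unit = int(spec[:-1]), spec[-1]
    return number * DAYS_PER_UNIT[unit]

def relative_endpoint(spec, today):
    """Date lying the given distance before `today`."""
    return today - timedelta(days=relative_to_days(spec))
```

       Taking today as date(2003, 5, 18), relative_endpoint("6w", today) is
       April 6th, 2003 and relative_endpoint("2w", today) is May 4th, 2003.
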
       Absolute times can be specified in many forms.  Some forms have
       different meanings depending on whether they define a start date or
       an end date.  Where a single expression specifies both the start and
       end (i.e. where the argument to d: doesn't contain a '-'), it will
       usually have different interpretations in the two cases.

       In the examples below, suppose the current date is Sunday May 18th,
       2003 (when I started to write this material).

       ┌─────────────────────┬──────────────────────┬───────────────────────┬─────────────────────────────────┐
       │Example              │ Start date           │ End date              │ Notes                           │
       ├─────────────────────┼──────────────────────┼───────────────────────┼─────────────────────────────────┤
       │d:20030301-20030425  │ March 1st, 2003      │ April 25th, 2003      │                                 │
       │d:030301-030425      │ March 1st, 2003      │ April 25th, 2003      │ century assumed                 │
       │d:mar1-apr25         │ March 1st, 2003      │ April 25th, 2003      │                                 │
       │d:Mar1-Apr25         │ March 1st, 2003      │ April 25th, 2003      │ case insensitive                │
       │d:MAR1-APR25         │ March 1st, 2003      │ April 25th, 2003      │ case insensitive                │
       │d:1mar-25apr         │ March 1st, 2003      │ April 25th, 2003      │ date and month in either order  │
       │d:2002               │ January 1st, 2002    │ December 31st, 2002   │ whole year                      │
       │d:mar                │ March 1st, 2003      │ March 31st, 2003      │ most recent March               │
       │d:oct                │ October 1st, 2002    │ October 31st, 2002    │ most recent October             │
       │d:21oct-mar          │ October 21st, 2002   │ March 31st, 2003      │ start before end                │
       │d:21apr-mar          │ April 21st, 2002     │ March 31st, 2003      │ start before end                │
       │d:21apr-             │ April 21st, 2003     │ May 18th, 2003        │ end omitted                     │
       │d:-21apr             │ January 1st, 1900    │ April 21st, 2003      │ start omitted                   │
       │d:6w-2w              │ April 6th, 2003      │ May 4th, 2003         │ both dates relative             │
       │d:21apr-1w           │ April 21st, 2003     │ May 11th, 2003        │ one date relative               │
       │d:21apr-2y           │ April 21st, 2001     │ May 18th, 2001        │ start before end                │
       │d:99-11              │ January 1st, 1999    │ May 11th, 2003        │ 2 digits are a day of the month │
       │                     │                      │                       │ if possible, otherwise a year   │
       │d:99oct-1oct         │ October 1st, 1999    │ October 1st, 2002     │ end before now, single digit is │
       │                     │                      │                       │ a day of the month              │
       │d:99oct-01oct        │ October 1st, 1999    │ October 31st, 2001    │ 2 digits starting with zero     │
       │                     │                      │                       │ treated as a year               │
       │d:oct99-oct1         │ October 1st, 1999    │ October 1st, 2002     │ day and month in either order   │
       │d:oct99-oct01        │ October 1st, 1999    │ October 31st, 2001    │ year and month in either order  │
       └─────────────────────┴──────────────────────┴───────────────────────┴─────────────────────────────────┘

       The principles in the table work as follows.

       ·      When the expression defines a period of more than a day (i.e.
              if a month or year is specified), the earliest day in the
              period is taken when the start date is defined, and the last
              day in the period if the end of the range is being defined.

       ·      The end date is always taken to be on or before the current
              date.

       ·      The start date is always taken to be on or before the end
              date.

NOTES
       If the match folder does not exist when running in search mode, it is
       automatically created.  For 'mformat=maildir' (the default), this
       should be all you need to do.  If you use 'mformat=mh', you may have
       to run some commands before your mailer will recognize the folder.
       For example, for mutt, you could do

              mkdir -p /home/richard/Mail/mfolder
              touch /home/richard/Mail/mfolder/.mh_sequences

       which seems to work.  Alternatively, within mutt, you could set
       MBOX_TYPE to in advance.

       If you use Sylpheed, the best way seems to be to create the new
       folder from within Sylpheed before letting mairix write into it.

EXAMPLES
       Suppose my email address is <richard@doesnt.exist>.

       Either of the following will match all messages newer than 3 months
       from me with the word 'chrony' in the subject line:

              mairix d:3m- f:richard,doesnt,exist s:chrony
              mairix d:3m- f:richard@doesnt.exist s:chrony

       Suppose I don't mind a few spurious matches on the address, I want a
       wider date range, and I suspect that some messages I replied to might
       have had the subject keyword spelt wrongly (let's allow up to 2
       errors):

              mairix d:6m- f:richard s:chrony=2

NOTES
       mairix works exclusively in terms of words.  The index that's built
       in indexing mode contains a table of which words occur in which
       messages.  Hence, the search capability is based on finding messages
       that contain particular words.  mairix defines a word as any string
       of alphanumeric characters and underscores.  Any whitespace,
       punctuation, hyphens etc. are treated as word boundaries.

       mairix has special handling for the To:, Cc: and From: headers.
       Besides the normal word scan, these headers are scanned a second
       time, where the characters '@', '-' and '.' are also treated as word
       characters.  This allows most (if not all) email addresses to appear
       in the database as single words.  So if you have a mail from
       wibble@foobar.zzz, it will match on both these searches:

              mairix f:foobar
              mairix f:wibble@foobar.zzz

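       The two word scans can be sketched in Python as follows.  This is an
       illustration of the rules just described rather than mairix's
       scanner: the names WORD, ADDRESS_WORD and index_words are
       hypothetical, and "alphanumeric" is assumed here to mean ASCII
       letters and digits.

```python
import re

WORD = re.compile(r"[A-Za-z0-9_]+")               # alphanumerics and underscore
ADDRESS_WORD = re.compile(r"[A-Za-z0-9_@.\-]+")   # also '@', '-' and '.'

def index_words(text, address_header=False):
    """Return the set of words a header or body would contribute."""
    words = set(WORD.findall(text))               # the normal word scan
    if address_header:                            # To:, Cc: and From: only
        words |= set(ADDRESS_WORD.findall(text))  # the second scan
    return words
```

       Scanning "wibble@foobar.zzz" as an address header gives the words
       wibble, foobar, zzz and wibble@foobar.zzz, which is why both searches
       above succeed.
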
       It should be clear by now that the searching cannot be used to find
       messages matching general regular expressions.  This has never been
       much of a limitation.  Most searches are for particular keywords that
       were in the messages, or details of the recipients, or the
       approximate date.

       It's also worth pointing out that there is no 'locality' information
       stored, so you can't search for messages that have one word 'close'
       to some other word.  For every message and every word, there is a
       simple yes/no condition stored - whether the message contains the
       word in a particular header or in the body.  So far this has proved
       to be adequate.  mairix has a similar feel to using an Internet
       search engine.

FILES
       ~/.mairixrc

AUTHOR
       Copyright (C) 2002-2006 Richard P. Curnow <rc@rc0.org.uk>

SEE ALSO
       mairixrc(5)

BUGS
       We need a plugin scheme to allow more types of attachment to be
       scanned and indexed.

                                 January 2006                        MAIRIX(1)