MAIRIX(1)                General Commands Manual                MAIRIX(1)

NAME
mairix - index and search mail folders

SYNOPSIS
Indexing
10 mairix [ -v|--verbose ] [ -p|--purge ] [ -f|--rcfile mairixrc ] [
11 -F|--fast-index ] [ --force-hash-key-new-database hash ]
12
13
14 Searching
15 mairix [ -v|--verbose ] [ -f|--rcfile mairixrc ] [ -r|--raw-output ] [
16 -x|--excerpt-output ] [ -H|--force-hardlinks ] [ -o|--mfolder mfolder ]
17 [ -a|--augment ] [ -t|--threads ] search-patterns
18
19
20 Other
21 mairix [ -h|--help ]
22
23 mairix [ -V|--version ]
24
25 mairix [ -d|--dump ]
26
27
DESCRIPTION
29 mairix indexes and searches a collection of email messages. The fold‐
30 ers containing the messages for indexing are defined in the configura‐
31 tion file. The indexing stage produces a database file. The database
32 file provides rapid access to details of the indexed messages during
33 searching operations. A search normally produces a folder (so-called
34 mfolder) containing the matched messages. However, a raw mode (-r)
35 exists which just lists the matched messages instead.
36
It can operate with the following folder types:
38
39 * maildir
40
41 * MH (compatible with the MH folder formats used by xmh, sylpheed,
42 claws-mail, nnml (Gnus) and evolution)
43
44 * mbox (including mboxes that have been compressed with gzip or
45 bzip2)
46
47 * IMAP: remote folders on an IMAP server
48
49 If maildir or MH source folders are used, and a search outputs its
50 matches to an mfolder in maildir or MH format, symbolic links are used
51 to reference the original messages inside the mfolder. However, if
mbox folders are involved, copies of messages are made instead. If IMAP
folders are used for both the source and the results folders, IMAP
server-side copies of
54 messages are made. With IMAP source folders and any other type of
55 results folder, messages are downloaded from the IMAP server to be
56 written to the results folder. With an IMAP results folder and any
57 other type of source folders, messages are uploaded to the IMAP server
58 to be appended to the results folder.
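
The rules in the paragraph above can be condensed into a small decision function. This is only a sketch of the documented behaviour, not mairix's own code: the function name, the folder-type strings and the returned labels are all invented for illustration.

```python
def transfer_method(source, results, force_hardlinks=False):
    """Return how matched messages reach the mfolder (labels are hypothetical).

    source and results are one of "maildir", "mh", "mbox" or "imap".
    """
    link_capable = {"maildir", "mh"}
    if source == "imap" and results == "imap":
        return "imap-server-copy"      # both ends on the IMAP server
    if source == "imap":
        return "download"              # IMAP source, local results folder
    if results == "imap":
        return "upload"                # local source, IMAP results folder
    if source in link_capable and results in link_capable:
        # -H / --force-hardlinks swaps symbolic links for hard links
        return "hardlink" if force_hardlinks else "symlink"
    return "copy"                      # an mbox folder is involved
```

For example, transfer_method("maildir", "mbox") gives "copy", because an mbox folder cannot hold links.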
59
OPTIONS
61 mairix decides whether indexing or searching is required by looking for
62 the presence of any search-patterns on the command line.
63
64
65 Special modes
66 -h, --help
67 Show usage summary and exit
68
69
70 -V, --version
71 Show program version and exit
72
73
-d, --dump
75 Dump the database's contents in human-readable form to stdout.
76
77
78 General options
79 -f mairixrc
80 --rcfile mairixrc
81 Specify an alternative configuration file to use. The default
82 configuration file is ~/.mairixrc.
83
84
85 -v, --verbose
86 Make the output more verbose
87
88
89 -Q, --no-integrity-checks
90 Normally mairix will do some internal integrity tests on the
91 database. The -Q option removes these checks, making mairix run
92 faster, but it will be less likely to detect internal problems
93 if any bugs creep in.
94
95 The nochecks directive in the rc file has the same effect.
96
97
98 --unlock
99 mairix locks its database file during any indexing or searching
100 operation to prevent multiple indexing runs interfering with
101 each other, or an indexing run interfering with search runs.
102 The --unlock option removes the lockfile before doing the
103 requested indexing or searching operation. This is a convenient
104 way of cleaning up a stale lockfile if an earlier run crashed
105 for some reason or was aborted.
106
107
108 Indexing options
109 -p, --purge
110 Cause stale (dead) messages to be purged from the database dur‐
111 ing an indexing run. (Normally, stale messages are left in the
112 database because of the additional cost of compacting away the
113 storage that they take up.)
114
115
116 -F, --fast-index
117 When processing maildir and MH folders, mairix normally compares
118 the mtime and size of each message against the values stored in
119 the database. If they have changed, the message will be res‐
120 canned. This check requires each message file to be stat'ed.
121 For large numbers of messages in these folder types, this can be
122 a sizeable overhead.
123
With this option, mairix assumes that a message currently on disc is
unchanged if its name matches one already in the database, and skips
the stat check.
127
128 A later indexing run without using this option will fix up any
129 rescans that were missed due to its use.
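
The difference between a normal and a fast indexing run can be pictured as follows. This is a minimal sketch under stated assumptions: the `stored` dictionary (path mapped to an (mtime, size) pair) is a hypothetical stand-in for the database, and the function is not taken from mairix itself.

```python
import os


def needs_rescan(path, stored, fast_index=False):
    """Decide whether a maildir/MH message file should be rescanned.

    stored is a hypothetical dict: path -> (mtime, size) from the last run.
    """
    if path not in stored:
        return True                    # never indexed before
    if fast_index:
        return False                   # -F: trust the name, skip the stat
    st = os.stat(path)                 # the per-message cost that -F avoids
    return (st.st_mtime, st.st_size) != stored[path]
```

With fast_index=True a changed file is not noticed until a later run without the option, which matches the caveat above.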
130
131
132 --force-hash-key-new-database hash
133 This option should only be used for debugging.
134 If a new database is created, hash is used as hash key, instead
135 of a random hash.
136
137
138 Search options
139 -a, --augment
Append newly matched messages to the current mfolder instead of
141 creating the mfolder from scratch.
142
143
144 -t, --threads
145 As well as returning the matched messages, also return every
146 message in the same thread as one of the real matches.
147
148
149 -r, --raw-output
150 Instead of creating an mfolder containing the matched messages,
151 just show their paths on stdout.
152
153
154 -x, --excerpt-output
155 Instead of creating an mfolder containing the matched messages,
156 display an excerpt from their headers on stdout. The excerpt
157 shows To, Cc, From, Subject and Date. With IMAP source folders,
158 this requires downloading each matched message from the IMAP
159 server.
160
161
162 -H, --force-hardlinks
163 Instead of creating symbolic links, force the use of hardlinks.
164 This helps mailers such as alpine to realize that there are new
165 mails in the search folder.
166
167
168 -o mfolder
169 --mfolder mfolder
170 Specify a temporary alternative path for the mfolder to use,
171 overriding the mfolder directive in the rc file.
172
173 mairix will refuse to output search results into any folder that
174 appears to be amongst those that are indexed. This is to pre‐
175 vent accidental deletion of emails.
176
177
178 Search patterns
179 t:word
180 Match word in the To: header.
181
182
183 c:word
184 Match word in the Cc: header.
185
186
187 f:word
188 Match word in the From: header.
189
190
191 s:word
192 Match word in the Subject: header.
193
194
195 m:word
196 Match word in the Message-ID: header.
197
198
199 b:word
200 Match word in the message body.
201
202 Message body is taken to mean any body part of type text/plain
203 or text/html. For text/html, text within meta tags is ignored.
204 In particular, the URLs inside <A HREF="..."> tags are not cur‐
205 rently indexed. Non-text attachments are ignored. If there's
206 an attachment of type message/rfc822, this is parsed and the
207 match is performed on this sub-message too. If a hit occurs,
208 the enclosing message is treated as having a hit.
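
To make the body-part rule concrete, here is a rough sketch using Python's standard email package. It only illustrates which parts count as "body" (text/plain, text/html, and the same inside message/rfc822 attachments); it is not how mairix parses mail, it does not strip HTML tags, and both function names are made up for this example.

```python
from email.mime.application import MIMEApplication
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def indexable_text_parts(msg):
    """Collect the decoded text of every text/plain or text/html part.

    Message.walk() also descends into message/rfc822 attachments, so text
    inside an enclosed message is returned as part of the outer message.
    """
    texts = []
    for part in msg.walk():
        if part.get_content_type() not in ("text/plain", "text/html"):
            continue                   # non-text attachments are ignored
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        texts.append(payload.decode(charset, errors="replace"))
    return texts


def build_nested_demo():
    """Build a small test message with an enclosed message/rfc822 part."""
    inner = MIMEText("forwarded note about chrony")
    outer = MIMEMultipart()
    outer.attach(MIMEText("outer body text"))
    outer.attach(MIMEApplication(b"\x00\x01\x02"))   # binary, skipped
    outer.attach(MIMEMessage(inner))
    return outer
```

A search for b:chrony would, by the rule above, hit the outer message because the word occurs in the enclosed one.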
209
210
211 d:[start-datespec]-[end-datespec]
Match messages with Date: headers lying in the specified range.
213
214
215 z:[low-size]-[high-size]
216 Match messages whose size lies in the specified range. If the
217 low-size argument is omitted it defaults to zero. If the high-
218 size argument is omitted it defaults to infinite size.
219
For example, to match messages between 10 kilobytes and 20 kilobytes
in size, the following search term can be used:
222
223 mairix z:10k-20k
224
225
226
227 The suffix 'k' on a number means multiply by 1024, and the suf‐
228 fix 'M' on a number means multiply by 1024*1024.
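
The size arithmetic can be checked with a few lines of Python. This is an illustrative sketch only; the function names are invented, and it follows the text literally in accepting just the lower-case 'k' and upper-case 'M' suffixes.

```python
import math


def parse_size(text):
    """Convert '10k', '2M' or '512' into a number of bytes."""
    multipliers = {"k": 1024, "M": 1024 * 1024}
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


def parse_size_range(spec):
    """Split the argument of z: into (low, high) byte counts."""
    low, _, high = spec.partition("-")
    return (parse_size(low) if low else 0,            # omitted low: zero
            parse_size(high) if high else math.inf)   # omitted high: no limit
```

So z:10k-20k covers sizes from 10240 to 20480 bytes.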
229
230
231 n:word
232 Match word occurring as the name of an attachment in the mes‐
233 sage. Since attachment names are usually long, this option
234 would usually be used in the substring form. So
235
236 mairix n:mairix=
237
238
239
240 would match all messages which have attachments whose names con‐
241 tain the substring mairix.
242
243 The attachment name is determined from the name=xxx or file‐
244 name=xxx qualifiers on the Content-Type: and Content-Disposi‐
245 tion: headers respectively.
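
For comparison, Python's standard email package resolves attachment names from the same two places: Message.get_filename() reads the filename parameter of Content-Disposition and falls back to the name parameter of Content-Type. The sketch below is an analogy, not mairix's implementation, and the helper names are hypothetical.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def attachment_names(msg):
    """List the attachment names found anywhere in the message."""
    names = []
    for part in msg.walk():
        filename = part.get_filename()   # filename=, else name=
        if filename:
            names.append(filename)
    return names


def build_attachment_demo():
    """Build a test message with one attachment named each way."""
    outer = MIMEMultipart()
    outer.attach(MIMEText("see attached"))
    first = MIMEApplication(b"data")
    first.add_header("Content-Disposition", "attachment",
                     filename="mairix-notes.txt")
    second = MIMEApplication(b"data", name="patch.diff")
    outer.attach(first)
    outer.attach(second)
    return outer
```

A substring search such as n:mairix= corresponds to asking whether any of these names contains "mairix".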
246
247
248 F:flags
249 Match messages with particular flag settings. The available
250 flags are 's' meaning seen, 'r' meaning replied, and 'f' meaning
251 flagged. The flags are case-insensitive. A flag letter may be
252 prefixed by a '-' to negate its sense. Thus
253
254
255 mairix F:-s d:1w-
256
257
258
259 would match any unread message less than a week old, and
260
261
262 mairix F:f-r d:-1m
263
264
265
266 would match any flagged message older than a month which you
267 haven't replied to yet.
268
269 Note that the flag characters and their meanings agree with
270 those used as the suffix letters on message filenames in maildir
271 folders.
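
The flag syntax can be modelled as two sets, one of flags that must be set and one of flags that must be clear. The following is a sketch of that reading of the text; the function names and the spelled-out flag names are assumptions made for the example.

```python
def parse_flags(spec):
    """Split an F: argument such as 'f-r' into (required, forbidden) sets."""
    names = {"s": "seen", "r": "replied", "f": "flagged"}
    required, forbidden = set(), set()
    negate = False
    for ch in spec.lower():            # flag letters are case-insensitive
        if ch == "-":
            negate = True              # applies to the next letter only
            continue
        if ch not in names:
            raise ValueError(f"unknown flag letter: {ch!r}")
        (forbidden if negate else required).add(names[ch])
        negate = False
    return required, forbidden


def flags_match(spec, message_flags):
    """True if a message with the given set of flags satisfies spec."""
    required, forbidden = parse_flags(spec)
    return required <= message_flags and not (forbidden & message_flags)
```

Under this model F:f-r accepts a message that is flagged and not replied to, whatever its seen state.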
272
273
274 Searching for a match amongst more than one part of a message
Multiple message parts may be grouped together, if a match in any of
them is sought. Common examples follow.
277
278
279 tc:word
280 Match word in either the To: or Cc: headers (or both).
281
282
283 bs:word
284 Match word in either the Subject: header or the message body (or
285 both).
286
287
288 The a: search pattern is an abbreviation for tcf:; i.e. match the word
289 in the To:, Cc: or From: headers. ("a" stands for "address" in this
290 case.)
291
292
293 Match words
294 The word argument to the search strings can take various forms.
295
296
297 ~word
298 Match messages not containing the word.
299
300
301 word1,word2
302 This matches if both the words are matched in the specified mes‐
303 sage part.
304
305
306 word1/word2
This matches if either of the words is matched in the specified
308 message part.
309
310
311 substring=
Match any word containing substring as a substring.
313
314
315 substring=N
316 Match any word containing substring, allowing up to N errors in
317 the match. For example, if N is 1, a single error is allowed,
318 where an error can be
319
320 * a missing letter
321
322 * an extra letter
323
324 * a different letter.
325
326
327 ^substring=
328 Match any word containing substring as a substring, with the
329 requirement that substring occurs at the beginning of the
330 matched word.
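
One way to picture the substring=N form is as approximate substring matching, where each missing, extra or different letter costs one error. The sketch below uses the classic dynamic-programming formulation (Sellers' algorithm); it is an assumption that this reflects the documented semantics, and mairix's actual matching code may work differently. The ^ anchor is not modelled.

```python
def approx_substring_match(pattern, word, max_errors):
    """True if pattern occurs somewhere in word with at most max_errors edits."""
    # prev[i]: fewest edits to match pattern[:i] against some stretch of the
    # word ending at the current position (before any letters: i deletions).
    prev = list(range(len(pattern) + 1))
    if prev[-1] <= max_errors:
        return True
    for ch in word:
        cur = [0]                      # a match may start at any position
        for i, pch in enumerate(pattern, 1):
            cost = 0 if pch == ch else 1
            cur.append(min(prev[i - 1] + cost,   # same or different letter
                           prev[i] + 1,          # extra letter in the word
                           cur[i - 1] + 1))      # letter missing from the word
        if cur[-1] <= max_errors:
            return True
        prev = cur
    return False
```

With max_errors set to 0 this reduces to an ordinary substring test, which corresponds to the plain substring= form.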
331
332
333 Precedence matters
334 The binding order of the constructions is:
335
336
337 1. Individual command line arguments define separate conditions
338 which are AND-ed together
339
340
341 2. Within a single argument, the letters before the colon define
342 which message parts the expression applies to. If there is no
343 colon, the expression applies to all the headers listed earlier
344 and the body.
345
346
347 3. After the colon, slashes delineate separate disjuncts, which are
348 OR-ed together.
349
350
351 4. Each disjunct may contain separate conjuncts, which are sepa‐
352 rated by commas. These conditions are AND-ed together.
353
354
5. Each conjunct may start with a tilde to negate it, and may be
followed by an equals sign to indicate a substring match, optionally
followed by an integer to define the maximum number of errors
allowed.
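
The binding order can be illustrated by splitting one command line argument in the same sequence: colon first, then slashes, then commas, then the tilde and the substring marker. This is a hedged sketch for word-matching patterns only; the Conjunct type and parse_term function are invented here, and the d:, z: and F: forms and the ^ anchor are deliberately not handled.

```python
import re
from collections import namedtuple

Conjunct = namedtuple("Conjunct", "word negate substring max_errors")


def parse_term(arg):
    """Return (parts, disjuncts); disjuncts is a list of lists of Conjunct."""
    parts, sep, expr = arg.partition(":")
    if not sep:                        # no colon: all headers and the body
        parts, expr = None, arg
    disjuncts = []
    for disjunct in expr.split("/"):   # slashes separate OR-ed alternatives
        conjuncts = []
        for token in disjunct.split(","):   # commas separate AND-ed words
            negate = token.startswith("~")
            if negate:
                token = token[1:]
            m = re.fullmatch(r"(.+)=(\d*)", token)
            if m:                      # trailing '=' or '=N': substring match
                conjuncts.append(
                    Conjunct(m.group(1), negate, True, int(m.group(2) or 0)))
            else:
                conjuncts.append(Conjunct(token, negate, False, 0))
        disjuncts.append(conjuncts)
    return parts, disjuncts
```

Separate command line arguments would each be parsed this way and their results AND-ed together.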
359
360
361 Date specification
362 This section describes the syntax used for specifying dates when
363 searching using the `d:' option.
364
365 Dates are specified as a range. The start and end of the range can
366 both be specified. Alternatively, if the start is omitted, it is
367 treated as being the beginning of time. If the end is omitted, it is
368 treated as the current time.
369
370 There are 4 basic formats:
371
372 d:start-end
373 Specify both start and end explicitly
374
375 d:start-
376 Specify start, end is the current time
377
378 d:-end Specify end, start is 'a long time ago' (i.e. early enough to
379 include any message).
380
381 d:period
382 Specify start and end implicitly, as the start and end of the
383 period given.
384
385
386 The start and end can be specified either absolute or relative. A rel‐
387 ative endpoint is given as a number followed by a single letter defin‐
388 ing the scaling:
389
390
┌────────┬─────────────┬───────────┬───────────────────────┐
│letter  │ short for   │ example   │ meaning               │
├────────┼─────────────┼───────────┼───────────────────────┤
│d       │ days        │ 3d        │ 3 days                │
│w       │ weeks       │ 2w        │ 2 weeks (14 days)     │
│m       │ months      │ 5m        │ 5 months (150 days)   │
│y       │ years       │ 4y        │ 4 years (4*365 days)  │
└────────┴─────────────┴───────────┴───────────────────────┘
399
400 Months are always treated as 30 days, and years as 365 days, for this
401 purpose.
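
The relative-date arithmetic is simple enough to reproduce. The sketch below is only a worked check of the scaling rules just given (the function name is made up, and it is not mairix's date parser).

```python
from datetime import date, timedelta

# Scaling letters as documented: months are 30 days, years are 365 days.
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}


def relative_endpoint(spec, today):
    """Turn a relative endpoint such as '6w' into a calendar date."""
    count, unit = int(spec[:-1]), spec[-1]
    return today - timedelta(days=count * DAYS_PER_UNIT[unit])
```

Taking today as 18 May 2003, relative_endpoint("6w", date(2003, 5, 18)) is 6 April 2003 and relative_endpoint("2w", date(2003, 5, 18)) is 4 May 2003.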
402
Absolute times can be specified in many forms. Some forms mean
different things depending on whether they define the start date or the
end date. Where a single expression specifies both the start and the end
(i.e. where the argument to d: doesn't contain a `-'), it will usually
be interpreted differently for each of the two endpoints.
408
409 In the examples below, suppose the current date is Sunday May 18th,
410 2003 (when I started to write this material.)
411
412
┌─────────────────────┬──────────────────────┬───────────────────────┬─────────────────────────────────┐
│Example              │ Start date           │ End date              │ Notes                           │
├─────────────────────┼──────────────────────┼───────────────────────┼─────────────────────────────────┤
│d:20030301-20030425  │ March 1st, 2003      │ April 25th, 2003      │                                 │
│d:030301-030425      │ March 1st, 2003      │ April 25th, 2003      │ century assumed                 │
│d:mar1-apr25         │ March 1st, 2003      │ April 25th, 2003      │                                 │
│d:Mar1-Apr25         │ March 1st, 2003      │ April 25th, 2003      │ case insensitive                │
│d:MAR1-APR25         │ March 1st, 2003      │ April 25th, 2003      │ case insensitive                │
│d:1mar-25apr         │ March 1st, 2003      │ April 25th, 2003      │ day and month in either order   │
│d:2002               │ January 1st, 2002    │ December 31st, 2002   │ whole year                      │
│d:mar                │ March 1st, 2003      │ March 31st, 2003      │ most recent March               │
│d:oct                │ October 1st, 2002    │ October 31st, 2002    │ most recent October             │
│d:21oct-mar          │ October 21st, 2002   │ March 31st, 2003      │ start before end                │
│d:21apr-mar          │ April 21st, 2002     │ March 31st, 2003      │ start before end                │
│d:21apr-             │ April 21st, 2003     │ May 18th, 2003        │ end omitted                     │
│d:-21apr             │ January 1st, 1900    │ April 21st, 2003      │ start omitted                   │
│d:6w-2w              │ April 6th, 2003      │ May 4th, 2003         │ both dates relative             │
│d:21apr-1w           │ April 21st, 2003     │ May 11th, 2003        │ one date relative               │
│d:21apr-2y           │ April 21st, 2001     │ May 18th, 2001        │ start before end                │
│d:99-11              │ January 1st, 1999    │ May 11th, 2003        │ 2 digits are a day of the month │
│                     │                      │                       │ if possible, otherwise a year   │
│d:99oct-1oct         │ October 1st, 1999    │ October 1st, 2002     │ end before now, single digit is │
│                     │                      │                       │ a day of the month              │
│d:99oct-01oct        │ October 1st, 1999    │ October 31st, 2001    │ 2 digits starting with zero     │
│                     │                      │                       │ treated as a year               │
│d:oct99-oct1         │ October 1st, 1999    │ October 1st, 2002     │ day and month in either order   │
│d:oct99-oct01        │ October 1st, 1999    │ October 31st, 2001    │ year and month in either order  │
└─────────────────────┴──────────────────────┴───────────────────────┴─────────────────────────────────┘
441
442 The principles in the table work as follows.
443
444 · When the expression defines a period of more than a day (i.e. if
445 a month or year is specified), the earliest day in the period is
446 taken when the start date is defined, and the last day in the
447 period if the end of the range is being defined.
448
449 · The end date is always taken to be on or before the current
450 date.
451
452 · The start date is always taken to be on or before the end date.
453
454
456 If the match folder does not exist when running in search mode, it is
457 automatically created. For 'mformat=maildir' (the default), this
458 should be all you need to do. If you use 'mformat=mh', you may have to
run some commands before your mailer will recognize the folder. For
example, for mutt, you could do
461
462 mkdir -p /home/richard/Mail/mfolder
463 touch /home/richard/Mail/mfolder/.mh_sequences
464
465 which seems to work. Alternatively, within mutt, you could set
466 MBOX_TYPE to 'mh' and save a message to '+mfolder' to have mutt set up
467 the structure for you in advance.
468
469 If you use Sylpheed, the best way seems to be to create the new folder
470 from within Sylpheed before letting mairix write into it.
471
472
EXAMPLES
474 Suppose my email address is <richard@doesnt.exist>.
475
Either of the following will match all messages from me that are less
than 3 months old and have the word 'chrony' in the subject line:

mairix d:3m- f:richard,doesnt,exist s:chrony
mairix d:3m- f:richard@doesnt.exist s:chrony
481
482 Suppose I don't mind a few spurious matches on the address, I want a
483 wider date range, and I suspect that some messages I replied to might
484 have had the subject keyword spelt wrongly (let's allow up to 2
485 errors):
486
487 mairix d:6m- f:richard s:chrony=2
488
490 mairix works exclusively in terms of words. The index that's built in
491 indexing mode contains a table of which words occur in which messages.
492 Hence, the search capability is based on finding messages that contain
particular words. mairix defines a word as any string of alphanumeric
characters and underscores. Any whitespace, punctuation, hyphens, etc.
are treated as word boundaries.
496
497 mairix has special handling for the To:, Cc: and From: headers.
498 Besides the normal word scan, these headers are scanned a second time,
499 where the characters '@', '-' and '.' are also treated as word charac‐
500 ters. This allows most (if not all) email addresses to appear in the
501 database as single words. So if you have a mail from wibble@foo‐
502 bar.zzz, it will match on both these searches
503
504
505 mairix f:foobar
506 mairix f:wibble@foobar.zzz
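
The two scans can be imitated with a pair of regular expressions. This is a sketch of the word definition given above, assuming plain ASCII letters and digits and ignoring any case folding; the names are illustrative and the code is not taken from mairix.

```python
import re

# Normal scan: letters, digits and underscore form a word.
WORD = re.compile(r"[A-Za-z0-9_]+")
# Second scan for To:, Cc: and From: also keeps '@', '-' and '.'.
ADDRESS_WORD = re.compile(r"[A-Za-z0-9_@.\-]+")


def body_words(text):
    """Words produced by the normal scan."""
    return set(WORD.findall(text))


def address_header_words(text):
    """Words produced by both scans of an address header."""
    return body_words(text) | set(ADDRESS_WORD.findall(text))
```

For a header value of wibble@foobar.zzz this yields the words wibble, foobar and zzz as well as the whole address, which is why both searches above succeed.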
507
508 It should be clear by now that the searching cannot be used to find
509 messages matching general regular expressions. This has never been
510 much of a limitation. Most searches are for particular keywords that
511 were in the messages, or details of the recipients, or the approximate
512 date.
513
514 It's also worth pointing out that there is no 'locality' information
stored, so you can't search for messages that have one word 'close' to
516 some other word. For every message and every word, there is a simple
517 yes/no condition stored - whether the message contains the word in a
particular header or in the body. So far this has proved to be
adequate. Using mairix has a similar feel to using an Internet search
engine.
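
The yes/no table described here behaves like a simple inverted index. The toy model below is only meant to show why word searches are fast and why proximity searches are impossible; the class, its methods and the use of single-letter part names as keys are all assumptions, and mairix's real database format is not modelled.

```python
from collections import defaultdict


class WordIndex:
    """Toy index: (message part, word) -> set of message numbers."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, msg_id, part, words):
        for word in words:
            self._postings[(part, word)].add(msg_id)

    def lookup(self, part, word):
        return set(self._postings.get((part, word), ()))


def search_all(index, terms):
    """AND together several (part, word) conditions."""
    results = None
    for part, word in terms:
        hits = index.lookup(part, word)
        results = hits if results is None else results & hits
    return results or set()
```

Because only membership is recorded, nothing in such a table says where in the message a word occurred.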
520
521
FILES
~/.mairixrc

AUTHOR
Copyright (C) 2002-2006 Richard P. Curnow <rc@rc0.org.uk>

SEE ALSO
mairixrc(5)

BUGS
We need a plugin scheme to allow more types of attachment to be scanned
and indexed.
535
536
537
538
539 January 2006 MAIRIX(1)