MAIRIX(1)                General Commands Manual                MAIRIX(1)

NAME
mairix - index and search mail folders

SYNOPSIS
Indexing
    mairix [ -v|--verbose ] [ -p|--purge ] [ -f|--rcfile mairixrc ]
           [ -F|--fast-index ] [ --force-hash-key-new-database hash ]

Searching
    mairix [ -v|--verbose ] [ -f|--rcfile mairixrc ] [ -r|--raw-output ]
           [ -x|--excerpt-output ] [ -H|--force-hardlinks ]
           [ -o|--mfolder mfolder ] [ -a|--augment ] [ -t|--threads ]
           search-patterns

Other
    mairix [ -h|--help ]

    mairix [ -V|--version ]

    mairix [ -d|--dump ]

DESCRIPTION
mairix indexes and searches a collection of email messages. The folders
containing the messages for indexing are defined in the configuration
file. The indexing stage produces a database file. The database file
provides rapid access to details of the indexed messages during
searching operations. A search normally produces a folder (a so-called
mfolder) containing the matched messages. However, a raw mode (-r)
exists which just lists the matched messages instead.

mairix can operate with the following folder types:

* maildir

* MH (compatible with the MH folder formats used by xmh, sylpheed,
  claws-mail, nnml (Gnus) and evolution)

* mbox (including mboxes that have been compressed with gzip or bzip2)

* MMDF (including those compressed with gzip or bzip2)

* IMAP: remote folders on an IMAP server

If maildir or MH source folders are used, and a search outputs its
matches to an mfolder in maildir or MH format, symbolic links are used
to reference the original messages inside the mfolder. However, if mbox
or MMDF folders are involved, copies of messages are made instead. If
IMAP folders are used for both the source and the results, IMAP
server-side copies of messages are made. With IMAP source folders and
any other type of results folder, messages are downloaded from the IMAP
server to be written to the results folder. With an IMAP results folder
and any other type of source folder, messages are uploaded to the IMAP
server to be appended to the results folder.
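
The rules in the paragraph above can be summarised as a small decision
function. This is only an illustrative sketch: the function name, the
folder-type strings and the returned labels are invented here and are
not part of mairix.

```python
# Hypothetical helper: names and return values are illustrative only.
LINKABLE = {"maildir", "mh"}


def delivery_method(source: str, mfolder: str) -> str:
    """Return how matched messages reach the mfolder, per the rules above."""
    if source == "imap" and mfolder == "imap":
        return "server-side copy"
    if source == "imap":
        return "download"
    if mfolder == "imap":
        return "upload"
    if source in LINKABLE and mfolder in LINKABLE:
        return "symlink"
    # mbox or MMDF is involved on at least one side
    return "copy"
```

For instance, delivery_method("maildir", "mh") selects symbolic links,
while any combination involving mbox selects copying.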

OPTIONS
mairix decides whether indexing or searching is required by looking for
the presence of any search-patterns on the command line.

Special modes
-h, --help
    Show usage summary and exit.

-V, --version
    Show program version and exit.

-d, --dump
    Dump the database's contents in human-readable form to stdout.

General options
-f mairixrc
--rcfile mairixrc
    Specify an alternative configuration file to use. The default
    configuration file is ~/.mairixrc.

-v, --verbose
    Make the output more verbose.

-Q, --no-integrity-checks
    Normally mairix performs some internal integrity tests on the
    database. The -Q option removes these checks, making mairix run
    faster, but making it less likely to detect internal problems if
    any bugs creep in.

    The nochecks directive in the rc file has the same effect.

--unlock
    mairix locks its database file during any indexing or searching
    operation to prevent multiple indexing runs from interfering with
    each other, or an indexing run from interfering with search runs.
    The --unlock option removes the lockfile before doing the requested
    indexing or searching operation. This is a convenient way of
    cleaning up a stale lockfile if an earlier run crashed or was
    aborted.

Indexing options
-p, --purge
    Cause stale (dead) messages to be purged from the database during
    an indexing run. (Normally, stale messages are left in the database
    because of the additional cost of compacting away the storage that
    they take up.)

-F, --fast-index
    When processing maildir and MH folders, mairix normally compares
    the mtime and size of each message against the values stored in the
    database. If they have changed, the message is rescanned. This
    check requires each message file to be stat'ed. For large numbers
    of messages in these folder types, this can be a sizeable overhead.

    This option tells mairix to assume that a message currently on disc
    is unchanged if its name matches one already in the database.

    A later indexing run without this option will carry out any rescans
    that were missed because of its use.
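
The trade-off behind -F can be pictured with a short sketch. It is not
mairix's code (mairix is not written in Python); the function name and
the idea of passing the stored values as arguments are assumptions made
for illustration.

```python
import os


def needs_rescan(path, stored_mtime, stored_size, fast_index=False):
    """Decide whether an already-indexed message file should be rescanned."""
    if fast_index:
        # -F as described above: the name is already known, so the
        # message is assumed unchanged and the stat call is skipped.
        return False
    st = os.stat(path)
    return int(st.st_mtime) != stored_mtime or st.st_size != stored_size
```

With fast_index set, no stat call is made at all, which is where the
saving on large maildir and MH folders would come from.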

--force-hash-key-new-database hash
    This option should only be used for debugging. If a new database is
    created, hash is used as the hash key instead of a random hash.

Search options
-a, --augment
    Append newly matched messages to the current mfolder instead of
    creating the mfolder from scratch.

-t, --threads
    As well as returning the matched messages, also return every
    message in the same thread as one of the real matches.

-r, --raw-output
    Instead of creating an mfolder containing the matched messages,
    just show their paths on stdout.

-x, --excerpt-output
    Instead of creating an mfolder containing the matched messages,
    display an excerpt from their headers on stdout. The excerpt shows
    To, Cc, From, Subject and Date. With IMAP source folders, this
    requires downloading each matched message from the IMAP server.

-H, --force-hardlinks
    Instead of creating symbolic links, force the use of hardlinks.
    This helps mailers such as alpine to realize that there are new
    mails in the search folder.

-o mfolder
--mfolder mfolder
    Specify a temporary alternative path for the mfolder to use,
    overriding the mfolder directive in the rc file.

    mairix will refuse to output search results into any folder that
    appears to be amongst those that are indexed. This is to prevent
    accidental deletion of emails.

Search patterns
t:word
    Match word in the To: header.

c:word
    Match word in the Cc: header.

f:word
    Match word in the From: header.

s:word
    Match word in the Subject: header.

m:word
    Match word in the Message-ID: header.

b:word
    Match word in the message body.

    Message body is taken to mean any body part of type text/plain or
    text/html. For text/html, text within meta tags is ignored. In
    particular, the URLs inside <A HREF="..."> tags are not currently
    indexed. Non-text attachments are ignored. If there's an attachment
    of type message/rfc822, this is parsed and the match is performed
    on this sub-message too. If a hit occurs, the enclosing message is
    treated as having a hit.

d:[start-datespec]-[end-datespec]
    Match messages with Date: headers lying in the specified range.

z:[low-size]-[high-size]
    Match messages whose size lies in the specified range. If the
    low-size argument is omitted it defaults to zero. If the high-size
    argument is omitted it defaults to infinite size.

    For example, to match messages between 10 kilobytes and 20
    kilobytes in size, the following search term can be used:

        mairix z:10k-20k

    The suffix 'k' on a number means multiply by 1024, and the suffix
    'M' on a number means multiply by 1024*1024.
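
As a worked illustration of the size arithmetic, the sketch below turns
the argument of a z: pattern into a pair of byte counts. The function
names are invented for this example, and the suffixes are accepted only
in the exact case shown above ('k' and 'M'), which is an assumption.

```python
MULTIPLIERS = {"k": 1024, "M": 1024 * 1024}


def parse_size(text, default):
    """Convert one endpoint such as '10k' to bytes; empty means default."""
    if text == "":
        return default
    if text[-1] in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[text[-1]]
    return int(text)


def parse_size_range(spec):
    """Split 'low-high' and apply the documented defaults."""
    low, _, high = spec.partition("-")
    return parse_size(low, 0), parse_size(high, float("inf"))
```

Under these assumptions, the range 10k-20k corresponds to sizes from
10240 to 20480 bytes.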

n:word
    Match word occurring as the name of an attachment in the message.
    Since attachment names are usually long, this option would usually
    be used in the substring form. So

        mairix n:mairix=

    would match all messages which have attachments whose names contain
    the substring mairix.

    The attachment name is determined from the name=xxx or filename=xxx
    qualifiers on the Content-Type: and Content-Disposition: headers
    respectively.

F:flags
    Match messages with particular flag settings. The available flags
    are 's' meaning seen, 'r' meaning replied, and 'f' meaning flagged.
    The flags are case-insensitive. A flag letter may be prefixed by a
    '-' to negate its sense. Thus

        mairix F:-s d:1w-

    would match any unread message less than a week old, and

        mairix F:f-r d:-1m

    would match any flagged message older than a month which you
    haven't replied to yet.

    Note that the flag characters and their meanings agree with those
    used as the suffix letters on message filenames in maildir folders.
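
The flag syntax can be modelled as two sets: flags that must be present
and flags that must be absent. The sketch below is an illustration of
that reading only; the function names are invented, and a message's
flags are assumed to be given as a string of letters.

```python
def parse_flags(spec):
    """Split a flag spec such as 'f-r' into (required, forbidden) sets."""
    required, forbidden = set(), set()
    negate = False
    for ch in spec.lower():  # flags are case-insensitive
        if ch == "-":
            negate = True  # applies to the next flag letter only
            continue
        if ch not in "srf":
            raise ValueError(f"unknown flag letter: {ch!r}")
        (forbidden if negate else required).add(ch)
        negate = False
    return required, forbidden


def flags_match(spec, message_flags):
    """True if the message's flag letters satisfy the spec."""
    required, forbidden = parse_flags(spec)
    flags = set(message_flags.lower())
    return required <= flags and not (forbidden & flags)
```

Here F:f-r becomes "f required, r forbidden", so a message that is
flagged and seen but not replied to would satisfy it.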

Searching for a match amongst more than one part of a message
Multiple message parts may be grouped together, if a match in any of
them is sought. Common examples follow.

tc:word
    Match word in either the To: or Cc: headers (or both).

bs:word
    Match word in either the Subject: header or the message body (or
    both).

The a: search pattern is an abbreviation for tcf:; i.e. match the word
in the To:, Cc: or From: headers. ("a" stands for "address" in this
case.)

Match words
The word argument to the search strings can take various forms.

~word
    Match messages not containing the word.

word1,word2
    This matches if both the words are matched in the specified message
    part.

word1/word2
    This matches if either of the words is matched in the specified
    message part.

substring=
    Match any word containing substring as a substring.

substring=N
    Match any word containing substring, allowing up to N errors in the
    match. For example, if N is 1, a single error is allowed, where an
    error can be

    * a missing letter

    * an extra letter

    * a different letter.

^substring=
    Match any word containing substring as a substring, with the
    requirement that substring occurs at the beginning of the matched
    word.
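
One way to picture substring matching with errors is as an edit
distance between the pattern and the best-fitting stretch of the
candidate word, counting missing, extra and different letters as one
error each. The sketch below uses a standard dynamic-programming
approach for this; it is not taken from mairix, and combining the
anchored form with a non-zero error count is an assumption that the
text above does not cover.

```python
def substring_errors(pattern, word, anchored=False):
    """Fewest single-letter errors needed to find pattern inside word."""
    if anchored:
        # The match must start at the first letter of the word.
        prev = list(range(len(word) + 1))
    else:
        # The match may start anywhere in the word at no cost.
        prev = [0] * (len(word) + 1)
    for i, pc in enumerate(pattern, start=1):
        cur = [i]
        for j, wc in enumerate(word, start=1):
            cur.append(min(prev[j] + 1,                # pattern letter missing from word
                           cur[j - 1] + 1,             # extra letter in word
                           prev[j - 1] + (pc != wc)))  # same or different letter
        prev = cur
    # The match may end anywhere in the word.
    return min(prev)


def word_matches(pattern, word, max_errors=0, anchored=False):
    """True if word contains pattern with at most max_errors errors."""
    return substring_errors(pattern, word, anchored) <= max_errors
```

With max_errors left at zero this reduces to a plain substring test, or
a prefix test when anchored is set.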

Precedence matters
The binding order of the constructions is:

1.  Individual command line arguments define separate conditions which
    are AND-ed together.

2.  Within a single argument, the letters before the colon define which
    message parts the expression applies to. If there is no colon, the
    expression applies to all the headers listed earlier and the body.

3.  After the colon, slashes delineate separate disjuncts, which are
    OR-ed together.

4.  Each disjunct may contain separate conjuncts, which are separated
    by commas. These conditions are AND-ed together.

5.  Each conjunct may start with a tilde to negate it, and may be
    followed by an equals sign to indicate a substring match,
    optionally followed by an integer to define the maximum number of
    errors allowed.
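
The binding order can be made concrete with a small parser for one
command line argument. This is a sketch for word patterns only: it does
not handle the d:, z: or F: forms, leaves a leading '^' attached to the
word, and does not expand a: to tcf:. The function name and the tuple
layout are invented for the example.

```python
def parse_word_pattern(arg):
    """Return (parts, disjuncts) for one search argument.

    disjuncts is a list (OR) of lists (AND) of tuples
    (word, negated, substring, max_errors).
    """
    parts, colon, expr = arg.partition(":")
    if not colon:
        # No colon: the expression applies to all headers and the body.
        parts, expr = "", arg
    disjuncts = []
    for alternative in expr.split("/"):      # slashes separate OR-ed disjuncts
        conjuncts = []
        for term in alternative.split(","):  # commas separate AND-ed conjuncts
            negated = term.startswith("~")
            if negated:
                term = term[1:]
            word, equals, errors = term.partition("=")
            conjuncts.append(
                (word, negated, bool(equals), int(errors) if errors else 0)
            )
        disjuncts.append(conjuncts)
    return parts, disjuncts
```

For example, bs:foo,~bar/baz=1 parses as "(foo AND NOT bar) OR (baz as
a substring with up to 1 error)", applied to the body and the Subject:
header.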

Date specification
This section describes the syntax used for specifying dates when
searching using the `d:' option.

Dates are specified as a range. The start and end of the range can both
be specified. Alternatively, if the start is omitted, it is treated as
being the beginning of time. If the end is omitted, it is treated as
the current time.

There are 4 basic formats:

d:start-end
    Specify both start and end explicitly.

d:start-
    Specify start; end is the current time.

d:-end
    Specify end; start is 'a long time ago' (i.e. early enough to
    include any message).

d:period
    Specify start and end implicitly, as the start and end of the
    period given.

The start and end can each be specified as either absolute or relative.
A relative endpoint is given as a number followed by a single letter
defining the scaling:

┌────────┬───────────┬─────────┬──────────────────────┐
│ letter │ short for │ example │ meaning              │
├────────┼───────────┼─────────┼──────────────────────┤
│ d      │ days      │ 3d      │ 3 days               │
│ w      │ weeks     │ 2w      │ 2 weeks (14 days)    │
│ m      │ months    │ 5m      │ 5 months (150 days)  │
│ y      │ years     │ 4y      │ 4 years (4*365 days) │
└────────┴───────────┴─────────┴──────────────────────┘

Months are always treated as 30 days, and years as 365 days, for this
purpose.
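
The scaling rules amount to simple day arithmetic. The sketch below
converts a relative endpoint into a calendar date, counting back from a
given "today"; the function name is invented, and working in whole days
rather than seconds is a simplification.

```python
from datetime import date, timedelta

# Days per unit, as stated above: months are 30 days, years are 365.
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}


def relative_date(spec, today):
    """Convert a relative endpoint such as '6w' to a date before today."""
    count, unit = int(spec[:-1]), spec[-1]
    return today - timedelta(days=count * DAYS_PER_UNIT[unit])
```

Taking today as 18 May 2003, 6w gives 6 April 2003 and 2w gives 4 May
2003.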

Absolute times can be specified in many forms. Some forms have
different meanings when they define a start date from when they define
an end date. Where a single expression specifies both the start and end
(i.e. where the argument to d: doesn't contain a `-'), it will usually
have different interpretations in the two cases.

In the examples below, suppose the current date is Sunday May 18th,
2003 (when I started to write this material).

┌─────────────────────┬────────────────────┬─────────────────────┬─────────────────────────────────┐
│ Example             │ Start date         │ End date            │ Notes                           │
├─────────────────────┼────────────────────┼─────────────────────┼─────────────────────────────────┤
│ d:20030301-20030425 │ March 1st, 2003    │ April 25th, 2003    │                                 │
│ d:030301-030425     │ March 1st, 2003    │ April 25th, 2003    │ century assumed                 │
│ d:mar1-apr25        │ March 1st, 2003    │ April 25th, 2003    │                                 │
│ d:Mar1-Apr25        │ March 1st, 2003    │ April 25th, 2003    │ case insensitive                │
│ d:MAR1-APR25        │ March 1st, 2003    │ April 25th, 2003    │ case insensitive                │
│ d:1mar-25apr        │ March 1st, 2003    │ April 25th, 2003    │ day and month in either order   │
│ d:2002              │ January 1st, 2002  │ December 31st, 2002 │ whole year                      │
│ d:mar               │ March 1st, 2003    │ March 31st, 2003    │ most recent March               │
│ d:oct               │ October 1st, 2002  │ October 31st, 2002  │ most recent October             │
│ d:21oct-mar         │ October 21st, 2002 │ March 31st, 2003    │ start before end                │
│ d:21apr-mar         │ April 21st, 2002   │ March 31st, 2003    │ start before end                │
│ d:21apr-            │ April 21st, 2003   │ May 18th, 2003      │ end omitted                     │
│ d:-21apr            │ January 1st, 1900  │ April 21st, 2003    │ start omitted                   │
│ d:6w-2w             │ April 6th, 2003    │ May 4th, 2003       │ both dates relative             │
│ d:21apr-1w          │ April 21st, 2003   │ May 11th, 2003      │ one date relative               │
│ d:21apr-2y          │ April 21st, 2001   │ May 18th, 2001      │ start before end                │
│ d:99-11             │ January 1st, 1999  │ May 11th, 2003      │ 2 digits are a day of the month │
│                     │                    │                     │ if possible, otherwise a year   │
│ d:99oct-1oct        │ October 1st, 1999  │ October 1st, 2002   │ end before now, single digit is │
│                     │                    │                     │ a day of the month              │
│ d:99oct-01oct       │ October 1st, 1999  │ October 31st, 2001  │ 2 digits starting with zero     │
│                     │                    │                     │ treated as a year               │
│ d:oct99-oct1        │ October 1st, 1999  │ October 1st, 2002   │ day and month in either order   │
│ d:oct99-oct01       │ October 1st, 1999  │ October 31st, 2001  │ year and month in either order  │
└─────────────────────┴────────────────────┴─────────────────────┴─────────────────────────────────┘

The principles in the table work as follows.

• When the expression defines a period of more than a day (i.e. if a
  month or year is specified), the earliest day in the period is taken
  when the start date is being defined, and the last day in the period
  when the end date is being defined.

• The end date is always taken to be on or before the current date.

• The start date is always taken to be on or before the end date.
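
For the simplest case, a bare month name used as a period, these
principles can be applied mechanically. The sketch below follows them
literally and is not derived from mairix's source; the function name is
invented, and how mairix itself treats the month that is currently in
progress is not stated here, so that case should be regarded as
unspecified.

```python
import calendar
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def month_period(name, today):
    """Resolve a bare month name to (start, end) dates, per the rules above."""
    month = MONTHS.index(name.lower()) + 1
    year = today.year
    last_day = calendar.monthrange(year, month)[1]
    # The end date (last day of the period) must be on or before today.
    if date(year, month, last_day) > today:
        year -= 1
        last_day = calendar.monthrange(year, month)[1]
    # The start date is the earliest day of that same period.
    return date(year, month, 1), date(year, month, last_day)
```

Taking today as 18 May 2003, "mar" resolves to March 2003 while "oct"
resolves to October 2002, in line with the table.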

SETTING UP THE MATCH FOLDER
If the match folder does not exist when running in search mode, it is
automatically created. For 'mformat=maildir' (the default), this should
be all you need to do. If you use 'mformat=mh', you may have to run
some commands before your mailer will recognize the folder. For
example, for mutt you could run

    mkdir -p /home/richard/Mail/mfolder
    touch /home/richard/Mail/mfolder/.mh_sequences

which seems to work. Alternatively, within mutt, you could set
MBOX_TYPE to 'mh' and save a message to '+mfolder' to have mutt set up
the structure for you in advance.

If you use Sylpheed, the best way seems to be to create the new folder
from within Sylpheed before letting mairix write into it.

EXAMPLES
Suppose my email address is <richard@doesnt.exist>.

Either of the following will match all messages from me that are less
than 3 months old and have the word 'chrony' in the subject line:

    mairix d:3m- f:richard+doesnt+exist s:chrony
    mairix d:3m- f:richard@doesnt.exist s:chrony

Suppose I don't mind a few spurious matches on the address, I want a
wider date range, and I suspect that some messages I replied to might
have had the subject keyword spelt wrongly (let's allow up to 2
errors):

    mairix d:6m- f:richard s:chrony=2

NOTES
mairix works exclusively in terms of words. The index that's built in
indexing mode contains a table of which words occur in which messages.
Hence, the search capability is based on finding messages that contain
particular words. mairix defines a word as any string of alphanumeric
characters and underscores. Any whitespace, punctuation, hyphens etc.
are treated as word boundaries.

mairix has special handling for the To:, Cc: and From: headers. Besides
the normal word scan, these headers are scanned a second time, where
the characters '@', '-' and '.' are also treated as word characters.
This allows most (if not all) email addresses to appear in the database
as single words. So if you have a mail from wibble@foobar.zzz, it will
match on both these searches:

    mairix f:foobar
    mairix f:wibble@foobar.zzz
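
The two scans can be approximated with a pair of regular expressions.
This is only a rough model of the behaviour described above, not
mairix's tokenizer: the function name is invented, only ASCII letters
and digits are treated as alphanumeric, and no case folding is applied.

```python
import re

WORD = re.compile(r"[A-Za-z0-9_]+")             # normal word scan
ADDRESS_WORD = re.compile(r"[A-Za-z0-9_@.\-]+")  # second scan for To:, Cc:, From:


def header_words(value, address_header=False):
    """Return the set of words that would be indexed for a header value."""
    words = set(WORD.findall(value))
    if address_header:
        # '@', '-' and '.' also count as word characters in this pass.
        words |= set(ADDRESS_WORD.findall(value))
    return words
```

For a From: value containing wibble@foobar.zzz, the first scan yields
wibble, foobar and zzz, and the second adds the whole address as a
single word, which is why both searches above succeed.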

It should be clear by now that the searching cannot be used to find
messages matching general regular expressions. This has never been much
of a limitation. Most searches are for particular keywords that were in
the messages, or details of the recipients, or the approximate date.

It's also worth pointing out that there is no 'locality' information
stored, so you can't search for messages that have one word 'close' to
some other word. For every message and every word, there is a simple
yes/no condition stored: whether the message contains the word in a
particular header or in the body. So far this has proved to be
adequate. mairix has a similar feel to using an Internet search engine.
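
The yes/no nature of the stored information can be illustrated with a
toy inverted index. This is a conceptual sketch only; the class, its
methods and the in-memory layout are invented for the example and say
nothing about the format of the real database file.

```python
import re
from collections import defaultdict


class WordIndex:
    """Toy index: for each (part, word), the set of messages containing it."""

    def __init__(self):
        self.table = defaultdict(set)

    def add(self, msg_id, part, text):
        # Only presence is recorded; word positions are discarded.
        for word in re.findall(r"[A-Za-z0-9_]+", text):
            self.table[(part, word)].add(msg_id)

    def search(self, *terms):
        """terms are (part, word) pairs, AND-ed together."""
        hits = [self.table.get(term, set()) for term in terms]
        return set.intersection(*hits) if hits else set()
```

Because only presence is recorded, a query can ask whether two words
both occur in a message part, but not whether they occur near each
other.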

FILES
~/.mairixrc

AUTHOR
Copyright (C) 2002-2006 Richard P. Curnow <rc@rc0.org.uk>

SEE ALSO
mairixrc(5)

BUGS
We need a plugin scheme to allow more types of attachment to be scanned
and indexed.

January 2006                                                    MAIRIX(1)