MAIRIX(1)                   General Commands Manual                  MAIRIX(1)

NAME

       mairix - index and search mail folders

SYNOPSIS

   Indexing
       mairix [ -v|--verbose ] [ -p|--purge ] [ -f|--rcfile mairixrc ]
       [ -F|--fast-index ] [ --force-hash-key-new-database hash ]

   Searching
       mairix [ -v|--verbose ] [ -f|--rcfile mairixrc ] [ -r|--raw-output ]
       [ -x|--excerpt-output ] [ -H|--force-hardlinks ] [ -o|--mfolder mfolder ]
       [ -a|--augment ] [ -t|--threads ] search-patterns

   Other
       mairix [ -h|--help ]

       mairix [ -V|--version ]

       mairix [ -d|--dump ]

DESCRIPTION

       mairix indexes and searches a collection of email messages.  The
       folders containing the messages for indexing are defined in the
       configuration file.  The indexing stage produces a database file.  The
       database file provides rapid access to details of the indexed messages
       during searching operations.  A search normally produces a folder (a
       so-called mfolder) containing the matched messages.  However, a raw
       mode (-r) exists which just lists the matched messages instead.

       It can operate with the following folder types:

       *      maildir

       *      MH (compatible with the MH folder formats used by xmh, sylpheed,
              claws-mail, nnml (Gnus) and evolution)

       *      mbox (including mboxes that have been compressed with gzip or
              bzip2)

       *      IMAP: remote folders on an IMAP server

       If maildir or MH source folders are used, and a search outputs its
       matches to an mfolder in maildir or MH format, symbolic links are used
       to reference the original messages inside the mfolder.  However, if
       mbox folders are involved, copies of messages are made instead.  If
       IMAP folders are used for both the source and results folders, IMAP
       server-side copies of messages are made.  With IMAP source folders and
       any other type of results folder, messages are downloaded from the
       IMAP server to be written to the results folder.  With an IMAP results
       folder and any other type of source folders, messages are uploaded to
       the IMAP server to be appended to the results folder.
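
       The rules in the previous paragraph can be summarised as a small
       decision function.  This is only a sketch of the behaviour described
       above, not mairix's own code; the function name and the folder type
       strings ("maildir", "mh", "mbox", "imap") are assumptions made for
       illustration, and the effect of -H (hardlinks) is not modelled.

```python
def delivery_method(source, mfolder):
    """Return how a matched message reaches the mfolder (illustrative only)."""
    link_capable = {"maildir", "mh"}
    if source == "imap" and mfolder == "imap":
        return "imap server-side copy"
    if source == "imap":
        return "download from imap server"
    if mfolder == "imap":
        return "upload to imap server"
    if source in link_capable and mfolder in link_capable:
        return "symbolic link"
    # An mbox on either side means the message is copied.
    return "copy"


print(delivery_method("maildir", "mh"))
print(delivery_method("mbox", "maildir"))
```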

OPTIONS

       mairix decides whether indexing or searching is required by looking for
       the presence of any search-patterns on the command line.

   Special modes
       -h, --help
              Show usage summary and exit.

       -V, --version
              Show program version and exit.

       -d, --dump
              Dump the database's contents in human-readable form to stdout.

   General options
       -f mairixrc
       --rcfile mairixrc
              Specify an alternative configuration file to use.  The default
              configuration file is ~/.mairixrc.

       -v, --verbose
              Make the output more verbose.

       -Q, --no-integrity-checks
              Normally mairix will do some internal integrity tests on the
              database.  The -Q option removes these checks, making mairix run
              faster, but it will be less likely to detect internal problems
              if any bugs creep in.

              The nochecks directive in the rc file has the same effect.

       --unlock
              mairix locks its database file during any indexing or searching
              operation to prevent multiple indexing runs interfering with
              each other, or an indexing run interfering with search runs.
              The --unlock option removes the lockfile before doing the
              requested indexing or searching operation.  This is a convenient
              way of cleaning up a stale lockfile if an earlier run crashed
              for some reason or was aborted.

   Indexing options
       -p, --purge
              Cause stale (dead) messages to be purged from the database
              during an indexing run.  (Normally, stale messages are left in
              the database because of the additional cost of compacting away
              the storage that they take up.)

       -F, --fast-index
              When processing maildir and MH folders, mairix normally compares
              the mtime and size of each message against the values stored in
              the database.  If they have changed, the message will be
              rescanned.  This check requires each message file to be stat'ed.
              For large numbers of messages in these folder types, this can be
              a sizeable overhead.

              This option tells mairix to assume that a message currently on
              disc whose name matches one already in the database is
              unchanged.

              A later indexing run without this option will fix up any
              rescans that were missed due to its use.

       --force-hash-key-new-database hash
              This option should only be used for debugging.  If a new
              database is created, hash is used as the hash key, instead of a
              random hash.

   Search options
       -a, --augment
              Append newly matched messages to the current mfolder instead of
              creating the mfolder from scratch.

       -t, --threads
              As well as returning the matched messages, also return every
              message in the same thread as one of the real matches.

       -r, --raw-output
              Instead of creating an mfolder containing the matched messages,
              just show their paths on stdout.

       -x, --excerpt-output
              Instead of creating an mfolder containing the matched messages,
              display an excerpt from their headers on stdout.  The excerpt
              shows To, Cc, From, Subject and Date.  With IMAP source folders,
              this requires downloading each matched message from the IMAP
              server.

       -H, --force-hardlinks
              Instead of creating symbolic links, force the use of hardlinks.
              This helps mailers such as alpine to realize that there are new
              mails in the search folder.

       -o mfolder
       --mfolder mfolder
              Specify a temporary alternative path for the mfolder to use,
              overriding the mfolder directive in the rc file.

              mairix will refuse to output search results into any folder that
              appears to be amongst those that are indexed.  This is to
              prevent accidental deletion of emails.

   Search patterns
       t:word
              Match word in the To: header.

       c:word
              Match word in the Cc: header.

       f:word
              Match word in the From: header.

       s:word
              Match word in the Subject: header.

       m:word
              Match word in the Message-ID: header.

       b:word
              Match word in the message body.

              Message body is taken to mean any body part of type text/plain
              or text/html.  For text/html, text within meta tags is ignored.
              In particular, the URLs inside <A HREF="..."> tags are not
              currently indexed.  Non-text attachments are ignored.  If
              there's an attachment of type message/rfc822, this is parsed and
              the match is performed on this sub-message too.  If a hit
              occurs, the enclosing message is treated as having a hit.

       d:[start-datespec]-[end-datespec]
              Match messages with Date: headers lying in the specified range.

       z:[low-size]-[high-size]
              Match messages whose size lies in the specified range.  If the
              low-size argument is omitted it defaults to zero.  If the
              high-size argument is omitted it defaults to infinite size.

              For example, to match messages between 10 kilobytes and
              20 kilobytes in size, the following search term can be used:

                   mairix z:10k-20k

              The suffix 'k' on a number means multiply by 1024, and the
              suffix 'M' on a number means multiply by 1024*1024.
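
              The size arithmetic can be checked with a short sketch.  It
              models only what is stated above (the 'k' and 'M' suffixes and
              the defaults for omitted bounds); the function names are
              hypothetical, and whether mairix accepts other suffix
              spellings is not covered here.

```python
import math


def parse_size(text):
    """Convert '10k' or '2M' to a byte count; plain digits are bytes."""
    multipliers = {"k": 1024, "M": 1024 * 1024}
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


def parse_size_range(arg):
    """Parse the text after 'z:' into a (low, high) pair of byte counts."""
    low, _, high = arg.partition("-")
    # An omitted low bound is zero; an omitted high bound is unbounded.
    return (parse_size(low) if low else 0,
            parse_size(high) if high else math.inf)


print(parse_size_range("10k-20k"))
```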

       n:word
              Match word occurring as the name of an attachment in the
              message.  Since attachment names are usually long, this option
              would usually be used in the substring form.  So

                   mairix n:mairix=

              would match all messages which have attachments whose names
              contain the substring mairix.

              The attachment name is determined from the name=xxx or
              filename=xxx qualifiers on the Content-Type: and
              Content-Disposition: headers respectively.

       F:flags
              Match messages with particular flag settings.  The available
              flags are 's' meaning seen, 'r' meaning replied, and 'f' meaning
              flagged.  The flags are case-insensitive.  A flag letter may be
              prefixed by a '-' to negate its sense.  Thus

                   mairix F:-s d:1w-

              would match any unread message less than a week old, and

                   mairix F:f-r d:-1m

              would match any flagged message older than a month which you
              haven't replied to yet.

              Note that the flag characters and their meanings agree with
              those used as the suffix letters on message filenames in maildir
              folders.
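
              As an illustration of how a flag specification reads, the
              sketch below turns the text after 'F:' into required and
              forbidden flags.  It follows only the description above; the
              function names and the representation of a message's flags as
              a set of lowercase letters are assumptions.

```python
def parse_flags(spec):
    """Map each flag letter to True (must be set) or False (must be clear)."""
    wanted = {}
    negate = False
    for ch in spec.lower():  # flags are case-insensitive
        if ch == "-":
            negate = True  # applies to the next flag letter only
            continue
        if ch not in "srf":
            raise ValueError("unknown flag letter: %r" % ch)
        wanted[ch] = not negate
        negate = False
    return wanted


def flags_match(spec, message_flags):
    """True if the message's flag set satisfies every requirement in spec."""
    return all((flag in message_flags) == required
               for flag, required in parse_flags(spec).items())


print(parse_flags("f-r"))
```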

   Searching for a match amongst more than one part of a message
       Multiple message parts may be grouped together, if a match in any of
       them is sought.  Common examples follow.

       tc:word
              Match word in either the To: or Cc: headers (or both).

       bs:word
              Match word in either the Subject: header or the message body (or
              both).

       The a: search pattern is an abbreviation for tcf:; i.e. match the word
       in the To:, Cc: or From: headers.  ("a" stands for "address" in this
       case.)

   Match words
       The word argument to the search strings can take various forms.

       ~word
              Match messages not containing the word.

       word1,word2
              This matches if both the words are matched in the specified
              message part.

       word1/word2
              This matches if either of the words is matched in the specified
              message part.

       substring=
              Match any word containing substring as a substring.

       substring=N
              Match any word containing substring, allowing up to N errors in
              the match.  For example, if N is 1, a single error is allowed,
              where an error can be

              *      a missing letter

              *      an extra letter

              *      a different letter.

       ^substring=
              Match any word containing substring as a substring, with the
              requirement that substring occurs at the beginning of the
              matched word.
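
       To make the notion of "errors" concrete, here is a sketch of
       approximate substring matching in which each missing, extra or
       different letter costs one error.  How mairix implements this
       internally is not described in this page; the code uses a standard
       dynamic-programming technique (Sellers' approximate string search)
       as a stand-in, and the function name and parameters are
       hypothetical.

```python
def contains_approx(word, substring, max_errors=0, anchored=False):
    """True if substring occurs in word with at most max_errors errors.

    With anchored=True the occurrence must begin at the start of word.
    """
    m = len(substring)
    # prev[j]: fewest errors to match substring[:j] ending at the current
    # position in word.  Before any letter of word is read, j letters of
    # the substring are all "missing".
    prev = list(range(m + 1))
    if prev[m] <= max_errors:
        return True
    for i, ch in enumerate(word):
        # Unanchored: a match may start anywhere, so skipping a prefix of
        # word is free.  Anchored: every skipped letter counts as an error.
        cur = [i + 1 if anchored else 0]
        for j in range(1, m + 1):
            different = 0 if substring[j - 1] == ch else 1
            cur.append(min(prev[j - 1] + different,  # same or different letter
                           prev[j] + 1,              # extra letter in word
                           cur[j - 1] + 1))          # letter missing from word
        if cur[m] <= max_errors:
            return True
        prev = cur
    return False


print(contains_approx("chronny", "chrony", max_errors=1))
```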

   Precedence matters
       The binding order of the constructions is:

       1.     Individual command line arguments define separate conditions
              which are AND-ed together.

       2.     Within a single argument, the letters before the colon define
              which message parts the expression applies to.  If there is no
              colon, the expression applies to all the headers listed earlier
              and the body.

       3.     After the colon, slashes delineate separate disjuncts, which are
              OR-ed together.

       4.     Each disjunct may contain separate conjuncts, which are
              separated by commas.  These conditions are AND-ed together.

       5.     Each conjunct may start with a tilde to negate it, and may be
              followed by an equals sign to indicate a substring match,
              optionally followed by an integer to define the maximum number
              of errors allowed.
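
       The binding order can be sketched as a small parser for a single
       word-pattern argument.  This is an illustration of the rules listed
       above, not mairix's parser: the Conjunct class and function names are
       hypothetical, the d:, z: and F: patterns are not handled, and the
       relative order of a leading '~' and '^' is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Conjunct:
    word: str
    negated: bool = False    # leading '~'
    substring: bool = False  # trailing '='
    anchored: bool = False   # leading '^'
    max_errors: int = 0      # integer after '='


def parse_conjunct(term):
    negated = term.startswith("~")
    if negated:
        term = term[1:]
    anchored = term.startswith("^")
    if anchored:
        term = term[1:]
    word, equals, errors = term.partition("=")
    return Conjunct(word, negated, bool(equals), anchored,
                    int(errors) if errors else 0)


def parse_argument(arg):
    """Return (part letters, list of disjuncts, each a list of conjuncts)."""
    parts, colon, expr = arg.partition(":")
    if not colon:
        # No colon: the expression applies to the headers and the body.
        parts, expr = "", arg
    disjuncts = [[parse_conjunct(term) for term in alternative.split(",")]
                 for alternative in expr.split("/")]
    return parts, disjuncts


print(parse_argument("s:chrony=2/~foo,bar"))
```

       A message satisfies the argument if any disjunct has all of its
       conjuncts satisfied; separate arguments are then AND-ed together.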

   Date specification
       This section describes the syntax used for specifying dates when
       searching using the `d:' option.

       Dates are specified as a range.  The start and end of the range can
       both be specified.  Alternatively, if the start is omitted, it is
       treated as being the beginning of time.  If the end is omitted, it is
       treated as the current time.

       There are 4 basic formats:

       d:start-end
              Specify both start and end explicitly.

       d:start-
              Specify start; end is the current time.

       d:-end Specify end; start is 'a long time ago' (i.e. early enough to
              include any message).

       d:period
              Specify start and end implicitly, as the start and end of the
              period given.

       The start and end can each be specified as either absolute or
       relative.  A relative endpoint is given as a number followed by a
       single letter defining the scaling:

       ┌────────┬─────────────┬───────────┬───────────────────────┐
       │letter  │  short for  │  example  │  meaning              │
       ├────────┼─────────────┼───────────┼───────────────────────┤
       │d       │  days       │  3d       │  3 days               │
       │w       │  weeks      │  2w       │  2 weeks (14 days)    │
       │m       │  months     │  5m       │  5 months (150 days)  │
       │y       │  years      │  4y       │  4 years (4*365 days) │
       └────────┴─────────────┴───────────┴───────────────────────┘

       Months are always treated as 30 days, and years as 365 days, for this
       purpose.
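
       The scaling rules amount to simple day arithmetic, sketched below.
       The sketch assumes that a relative endpoint counts back from the
       current date; the function name is hypothetical and 18 May 2003 is
       used as an arbitrary "current date".

```python
from datetime import date, timedelta

# Fixed scalings from the table above: months are 30 days, years 365.
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}


def relative_date(spec, today):
    """Resolve a relative endpoint such as '6w' to a calendar date."""
    count, unit = int(spec[:-1]), spec[-1]
    return today - timedelta(days=count * DAYS_PER_UNIT[unit])


today = date(2003, 5, 18)
print(relative_date("6w", today), relative_date("2w", today))
```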

       Absolute times can be specified in many forms.  Some forms mean
       different things depending on whether they define a start date or an
       end date.  Where a single expression specifies both the start and end
       (i.e. where the argument to d: doesn't contain a `-'), it will usually
       have different interpretations in the two cases.

       In the examples below, suppose the current date is Sunday May 18th,
       2003 (when I started to write this material).

       ┌─────────────────────┬──────────────────────┬───────────────────────┬─────────────────────────────────┐
       │Example              │  Start date          │  End date             │  Notes                          │
       ├─────────────────────┼──────────────────────┼───────────────────────┼─────────────────────────────────┤
       │d:20030301-20030425  │  March 1st, 2003     │  April 25th, 2003     │                                 │
       │d:030301-030425      │  March 1st, 2003     │  April 25th, 2003     │  century assumed                │
       │d:mar1-apr25         │  March 1st, 2003     │  April 25th, 2003     │                                 │
       │d:Mar1-Apr25         │  March 1st, 2003     │  April 25th, 2003     │  case insensitive               │
       │d:MAR1-APR25         │  March 1st, 2003     │  April 25th, 2003     │  case insensitive               │
       │d:1mar-25apr         │  March 1st, 2003     │  April 25th, 2003     │  date and month in either order │
       │d:2002               │  January 1st, 2002   │  December 31st, 2002  │  whole year                     │
       │d:mar                │  March 1st, 2003     │  March 31st, 2003     │  most recent March              │
       │d:oct                │  October 1st, 2002   │  October 31st, 2002   │  most recent October            │
       │d:21oct-mar          │  October 21st, 2002  │  March 31st, 2003     │  start before end               │
       │d:21apr-mar          │  April 21st, 2002    │  March 31st, 2003     │  start before end               │
       │d:21apr-             │  April 21st, 2003    │  May 18th, 2003       │  end omitted                    │
       │d:-21apr             │  January 1st, 1900   │  April 21st, 2003     │  start omitted                  │
       │d:6w-2w              │  April 6th, 2003     │  May 4th, 2003        │  both dates relative            │
       │d:21apr-1w           │  April 21st, 2003    │  May 11th, 2003       │  one date relative              │
       │d:21apr-2y           │  April 21st, 2001    │  May 18th, 2001       │  start before end               │
       │d:99-11              │  January 1st, 1999   │  May 11th, 2003       │ 2 digits are a day of the month │
       │                     │                      │                       │ if possible, otherwise a year   │
       │d:99oct-1oct         │  October 1st, 1999   │  October 1st, 2002    │ end before now, single digit is │
       │                     │                      │                       │ a day of the month              │
       │d:99oct-01oct        │  October 1st, 1999   │  October 31st, 2001   │ 2  digits  starting  with  zero │
       │                     │                      │                       │ treated as a year               │
       │d:oct99-oct1         │  October 1st, 1999   │  October 1st, 2002    │ day and month in either order   │
       │d:oct99-oct01        │  October 1st, 1999   │  October 31st, 2001   │ year and month in either order  │
       └─────────────────────┴──────────────────────┴───────────────────────┴─────────────────────────────────┘

       The principles in the table work as follows.

       *      When the expression defines a period of more than a day (i.e. if
              a month or year is specified), the earliest day in the period is
              taken when the start date is defined, and the last day in the
              period if the end of the range is being defined.

       *      The end date is always taken to be on or before the current
              date.

       *      The start date is always taken to be on or before the end date.

SETTING UP THE MATCH FOLDER

       If the match folder does not exist when running in search mode, it is
       automatically created.  For 'mformat=maildir' (the default), this
       should be all you need to do.  If you use 'mformat=mh', you may have to
       run some commands before your mailer will recognize the folder.  For
       example, for mutt you could do

              mkdir -p /home/richard/Mail/mfolder
              touch /home/richard/Mail/mfolder/.mh_sequences

       which seems to work.  Alternatively, within mutt, you could set
       MBOX_TYPE to 'mh' and save a message to '+mfolder' to have mutt set up
       the structure for you in advance.

       If you use Sylpheed, the best way seems to be to create the new folder
       from within Sylpheed before letting mairix write into it.

EXAMPLES

       Suppose my email address is <richard@doesnt.exist>.

       Either of the following will match all messages from me that are newer
       than 3 months and have the word 'chrony' in the subject line:

              mairix d:3m- f:richard,doesnt,exist s:chrony
              mairix d:3m- f:richard@doesnt.exist s:chrony

       Suppose I don't mind a few spurious matches on the address, I want a
       wider date range, and I suspect that some messages I replied to might
       have had the subject keyword spelt wrongly (let's allow up to 2
       errors):

              mairix d:6m- f:richard s:chrony=2

NOTES

       mairix works exclusively in terms of words.  The index that's built in
       indexing mode contains a table of which words occur in which messages.
       Hence, the search capability is based on finding messages that contain
       particular words.  mairix defines a word as any string of alphanumeric
       characters and underscores.  Any whitespace, punctuation, hyphens etc.
       are treated as word boundaries.

       mairix has special handling for the To:, Cc: and From: headers.
       Besides the normal word scan, these headers are scanned a second time,
       where the characters '@', '-' and '.' are also treated as word
       characters.  This allows most (if not all) email addresses to appear in
       the database as single words.  So if you have a mail from
       wibble@foobar.zzz, it will match on both these searches:

              mairix f:foobar
              mairix f:wibble@foobar.zzz
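
       The two scans can be pictured with a pair of regular expressions.
       This is a sketch of the word definition given above, not mairix's
       tokenizer; it assumes ASCII alphanumerics, ignores any case folding,
       and the function name is hypothetical.

```python
import re

# A word: alphanumeric characters and underscores.
WORD = re.compile(r"[A-Za-z0-9_]+")
# Second scan for To:, Cc: and From:, where '@', '-' and '.' also count.
ADDRESS_WORD = re.compile(r"[A-Za-z0-9_@.\-]+")


def header_words(value, is_address_header=False):
    """Return the set of words that would be recorded for a header value."""
    words = set(WORD.findall(value))
    if is_address_header:
        words |= set(ADDRESS_WORD.findall(value))
    return words


print(sorted(header_words("wibble@foobar.zzz", is_address_header=True)))
```

       Both f:foobar and f:wibble@foobar.zzz then find a word recorded for
       the From: header.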

       It should be clear by now that the searching cannot be used to find
       messages matching general regular expressions.  This has never been
       much of a limitation.  Most searches are for particular keywords that
       were in the messages, or details of the recipients, or the approximate
       date.

       It's also worth pointing out that there is no 'locality' information
       stored, so you can't search for messages that have one word 'close' to
       some other word.  For every message and every word, there is a simple
       yes/no condition stored: whether the message contains the word in a
       particular header or in the body.  So far this has proved to be
       adequate.  Using mairix has a similar feel to using an Internet search
       engine.
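
       The yes/no table described above behaves like a simple inverted
       index.  The sketch below is a conceptual model only and says nothing
       about mairix's actual database format; the class, its methods and the
       use of integer message ids are assumptions, while the part letters
       (t, c, f, s, b) follow the search patterns in this page.

```python
from collections import defaultdict


class WordIndex:
    def __init__(self):
        # (part letter, word) -> ids of messages containing that word there
        self._postings = defaultdict(set)

    def add(self, msg_id, part, words):
        for word in words:
            self._postings[(part, word)].add(msg_id)

    def search(self, *conditions):
        """Each condition is (parts, word).

        Parts within one condition are OR-ed; conditions are AND-ed.
        """
        result = None
        for parts, word in conditions:
            hits = set()
            for part in parts:
                hits |= self._postings.get((part, word), set())
            result = hits if result is None else result & hits
        return result if result is not None else set()


index = WordIndex()
index.add(1, "s", {"chrony", "release"})
index.add(1, "b", {"hello"})
index.add(2, "b", {"chrony"})
print(index.search(("bs", "chrony"), ("b", "hello")))
```

       Because only presence is recorded, nothing in such a table can say
       how close two words are to each other within a message.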

FILES

       ~/.mairixrc

AUTHOR

       Copyright (C) 2002-2006 Richard P. Curnow <rc@rc0.org.uk>

SEE ALSO

       mairixrc(5)

BUGS

       We need a plugin scheme to allow more types of attachment to be scanned
       and indexed.

                                 January 2006                        MAIRIX(1)