MAIRIX(1)                   General Commands Manual                  MAIRIX(1)

NAME

6       mairix - index and search mail folders
7

SYNOPSIS

9   Indexing
10       mairix  [  -v|--verbose  ]  [  -p|--purge  ] [ -f|--rcfile mairixrc ] [
11       -F|--fast-index ] [ --force-hash-key-new-database hash ]
12
13
14   Searching
15       mairix [ -v|--verbose ] [ -f|--rcfile mairixrc ] [ -r|--raw-output ]  [
16       -x|--excerpt-output ] [ -H|--force-hardlinks ] [ -o|--mfolder mfolder ]
17       [ -a|--augment ] [ -t|--threads ] search-patterns
18
19
20   Other
21       mairix [ -h|--help ]
22
23       mairix [ -V|--version ]
24
25       mairix [ -d|--dump ]
26
27

DESCRIPTION

29       mairix indexes and searches a collection of email messages.  The  fold‐
30       ers  containing the messages for indexing are defined in the configura‐
31       tion file.  The indexing stage produces a database file.  The  database
32       file  provides  rapid  access to details of the indexed messages during
33       searching operations.  A search normally produces a  folder  (so-called
34       mfolder) containing the matched messages.  However, a raw mode (-r) ex‐
35       ists which just lists the matched messages instead.
36
       mairix can operate with the following folder types:
38
39       *      maildir
40
41       *      MH (compatible with the MH folder formats used by xmh, sylpheed,
42              claws-mail, nnml (Gnus) and evolution)
43
44       *      mbox  (including  mboxes  that have been compressed with gzip or
45              bzip2)
46
47       *      MMDF (including those compressed with gzip or bzip2)
48
49       *      IMAP: remote folders on an IMAP server
50
       If maildir or MH source folders are used, and a search outputs its
       matches to an mfolder in maildir or MH format, symbolic links are used
       to reference the original messages inside the mfolder.  However, if
       mbox or MMDF folders are involved, copies of messages are made instead.
       If IMAP folders are used for both source and results, IMAP server-side
       copies of messages are made.  With IMAP source folders and any other
       type of results folder, messages are downloaded from the IMAP server to
       be written to the results folder.  With an IMAP results folder and any
       other type of source folders, messages are uploaded to the IMAP server
       to be appended to the results folder.
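
       The rules in the paragraph above can be summarised as a small decision
       function.  This is only a sketch in Python: the function name, the
       folder-type strings and the returned labels are hypothetical and are
       not part of mairix itself.

```python
# Folder types whose messages are individual files and can be linked to.
LINKABLE = {"maildir", "mh"}

def delivery_method(source, results):
    """Return how a matched message reaches the mfolder (illustrative labels)."""
    if source == "imap" and results == "imap":
        return "server-side copy"
    if source == "imap":
        return "download"          # IMAP source, any other results type
    if results == "imap":
        return "upload"            # any other source, IMAP results
    if source in LINKABLE and results in LINKABLE:
        return "symlink"
    return "copy"                  # mbox or MMDF involved on either side

print(delivery_method("maildir", "mh"))
print(delivery_method("mbox", "maildir"))
```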
61

OPTIONS

63       mairix decides whether indexing or searching is required by looking for
64       the presence of any search-patterns on the command line.
65
66
67   Special modes
68       -h, --help
69              Show usage summary and exit
70
71
72       -V, --version
73              Show program version and exit
74
75
       -d, --dump
              Dump the database's contents in human-readable form to stdout.
78
79
80   General options
81       -f mairixrc
82       --rcfile mairixrc
83              Specify an alternative configuration file to use.   The  default
84              configuration file is ~/.mairixrc.
85
86
87       -v, --verbose
88              Make the output more verbose
89
90
91       -Q, --no-integrity-checks
92              Normally  mairix  will  do  some internal integrity tests on the
93              database.  The -Q option removes these checks, making mairix run
94              faster,  but  it will be less likely to detect internal problems
95              if any bugs creep in.
96
97              The nochecks directive in the rc file has the same effect.
98
99
100       --unlock
101              mairix locks its database file during any indexing or  searching
102              operation  to  prevent  multiple  indexing runs interfering with
103              each other, or an indexing run  interfering  with  search  runs.
104              The  --unlock  option  removes the lockfile before doing the re‐
105              quested indexing or searching operation.  This is  a  convenient
106              way  of  cleaning  up a stale lockfile if an earlier run crashed
107              for some reason or was aborted.
108
109
110   Indexing options
111       -p, --purge
112              Cause stale (dead) messages to be purged from the database  dur‐
113              ing  an indexing run.  (Normally, stale messages are left in the
114              database because of the additional cost of compacting  away  the
115              storage that they take up.)
116
117
118       -F, --fast-index
119              When processing maildir and MH folders, mairix normally compares
120              the mtime and size of each message against the values stored  in
121              the  database.   If  they have changed, the message will be res‐
122              canned.  This check requires each message file  to  be  stat'ed.
123              For large numbers of messages in these folder types, this can be
124              a sizeable overhead.
125
              This option tells mairix to assume that a message is unchanged
              when the name of the message currently on disc matches one
              already in the database.
129
130              A later indexing run without using this option will fix  up  any
131              rescans that were missed due to its use.
132
133
134       --force-hash-key-new-database hash
135              This option should only be used for debugging.
136              If  a new database is created, hash is used as hash key, instead
137              of a random hash.
138
139
140   Search options
       -a, --augment
              Append newly matched messages to the current mfolder instead of
              creating the mfolder from scratch.
144
145
146       -t, --threads
147              As  well  as  returning  the matched messages, also return every
148              message in the same thread as one of the real matches.
149
150
151       -r, --raw-output
152              Instead of creating an mfolder containing the matched  messages,
153              just show their paths on stdout.
154
155
156       -x, --excerpt-output
157              Instead  of creating an mfolder containing the matched messages,
158              display an excerpt from their headers on  stdout.   The  excerpt
159              shows  To, Cc, From, Subject and Date. With IMAP source folders,
160              this requires downloading each matched  message  from  the  IMAP
161              server.
162
163
164       -H, --force-hardlinks
165              Instead  of creating symbolic links, force the use of hardlinks.
166              This helps mailers such as alpine to realize that there are  new
167              mails in the search folder.
168
169
170       -o mfolder
171       --mfolder mfolder
172              Specify  a  temporary  alternative  path for the mfolder to use,
173              overriding the mfolder directive in the rc file.
174
175              mairix will refuse to output search results into any folder that
176              appears  to  be amongst those that are indexed.  This is to pre‐
177              vent accidental deletion of emails.
178
179
180   Search patterns
181       t:word
182              Match word in the To: header.
183
184
185       c:word
186              Match word in the Cc: header.
187
188
189       f:word
190              Match word in the From: header.
191
192
193       s:word
194              Match word in the Subject: header.
195
196
197       m:word
198              Match word in the Message-ID: header.
199
200
201       b:word
202              Match word in the message body.
203
204              Message body is taken to mean any body part of  type  text/plain
205              or  text/html.  For text/html, text within meta tags is ignored.
206              In particular, the URLs inside <A HREF="..."> tags are not  cur‐
207              rently  indexed.   Non-text attachments are ignored.  If there's
208              an attachment of type message/rfc822, this  is  parsed  and  the
209              match  is  performed  on this sub-message too.  If a hit occurs,
210              the enclosing message is treated as having a hit.
211
212
213       d:[start-datespec]-[end-datespec]
214              Match messages with Date: headers lying in the specific range.
215
216
217       z:[low-size]-[high-size]
218              Match messages whose size lies in the specified range.   If  the
219              low-size  argument is omitted it defaults to zero.  If the high-
220              size argument is omitted it defaults to infinite size.
221
              For example, to match messages between 10 kilobytes and 20
              kilobytes in size, the following search term can be used:
224
225                   mairix z:10k-20k
226
227
228
229              The  suffix 'k' on a number means multiply by 1024, and the suf‐
230              fix 'M' on a number means multiply by 1024*1024.
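
              The size arithmetic can be sketched as follows.  This is an
              illustration in Python rather than mairix's own code; the
              function names are hypothetical, and only the 'k' and 'M'
              suffixes described above are handled.

```python
import math

# Multipliers for the suffixes described in the text.
SUFFIXES = {"k": 1024, "M": 1024 * 1024}

def parse_size(text):
    """Convert a size such as '10k' or '2M' to a number of bytes."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)

def parse_size_range(arg):
    """Split 'low-high'; a missing low is 0, a missing high is infinite."""
    low, _, high = arg.partition("-")
    return (parse_size(low) if low else 0,
            parse_size(high) if high else math.inf)

print(parse_size_range("10k-20k"))  # (10240, 20480)
```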
231
232
233       n:word
234              Match word occurring as the name of an attachment  in  the  mes‐
235              sage.   Since  attachment  names  are  usually long, this option
236              would usually be used in the substring form.  So
237
238                   mairix n:mairix=
239
240
241
242              would match all messages which have attachments whose names con‐
243              tain the substring mairix.
244
245              The  attachment  name  is  determined from the name=xxx or file‐
246              name=xxx qualifiers on the  Content-Type:  and  Content-Disposi‐
247              tion: headers respectively.
248
249
250       F:flags
251              Match  messages  with  particular  flag settings.  The available
252              flags are 's' meaning seen, 'r' meaning replied, and 'f' meaning
253              flagged.   The flags are case-insensitive.  A flag letter may be
254              prefixed by a '-' to negate its sense.  Thus
255
256
257                   mairix F:-s d:1w-
258
259
260
261              would match any unread message less than a week old, and
262
263
264                   mairix F:f-r d:-1m
265
266
267
268              would match any flagged message older than  a  month  which  you
269              haven't replied to yet.
270
271              Note  that  the  flag  characters  and their meanings agree with
272              those used as the suffix letters on message filenames in maildir
273              folders.
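
              One way to read a flag string is sketched below in Python.  The
              function names are hypothetical, and a message's flags are
              assumed to be available as a set of lower-case letters.

```python
def parse_flags(spec):
    """Map each flag letter in spec to the state it must have."""
    wanted = {}
    negate = False
    for ch in spec.lower():          # flags are case-insensitive
        if ch == "-":
            negate = True            # applies to the next flag letter only
            continue
        if ch not in "srf":
            raise ValueError("unknown flag: " + ch)
        wanted[ch] = not negate
        negate = False
    return wanted

def flags_match(spec, message_flags):
    """True if the message's flag set satisfies every requirement in spec."""
    return all((flag in message_flags) == state
               for flag, state in parse_flags(spec).items())

print(parse_flags("f-r"))  # {'f': True, 'r': False}
```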
274
275
276   Searching for a match amongst more than one part of a message
       Multiple message parts may be grouped together, if a match in any of
       them is sought.  Common examples follow.
279
280
281       tc:word
282              Match word in either the To: or Cc: headers (or both).
283
284
285       bs:word
286              Match word in either the Subject: header or the message body (or
287              both).
288
289
290       The  a: search pattern is an abbreviation for tcf:; i.e. match the word
291       in the To:, Cc: or From: headers.  ("a" stands for  "address"  in  this
292       case.)
293
294
295   Match words
296       The word argument to the search strings can take various forms.
297
298
299       ~word
300              Match messages not containing the word.
301
302
303       word1,word2
304              This matches if both the words are matched in the specified mes‐
305              sage part.
306
307
308       word1/word2
309              This matches if either of the words are matched in the specified
310              message part.
311
312
313       substring=
314              Match any word containing substring as a substring
315
316
317       substring=N
318              Match  any word containing substring, allowing up to N errors in
319              the match.  For example, if N is 1, a single error  is  allowed,
320              where an error can be
321
322       *      a missing letter
323
324       *      an extra letter
325
326       *      a different letter.
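
              The three kinds of error correspond to the usual edit-distance
              operations.  The Python sketch below shows one textbook way to
              test whether a pattern occurs inside a word with at most N such
              errors.  It is not necessarily how mairix implements the match,
              and the function names are hypothetical.

```python
def substring_errors(pattern, word):
    """Smallest number of errors between pattern and any substring of word."""
    # prev[j] is the cheapest way to match the pattern prefix handled so far
    # against some substring of word that ends at position j.  The first row
    # is all zeros because a match may begin anywhere in the word.
    prev = [0] * (len(word) + 1)
    for i, p in enumerate(pattern, start=1):
        cur = [i]                    # pattern prefix against empty text
        for j, w in enumerate(word, start=1):
            cur.append(min(prev[j - 1] + (p != w),  # same or different letter
                           prev[j] + 1,             # letter missing from word
                           cur[j - 1] + 1))         # extra letter in word
        prev = cur
    return min(prev)

def approx_match(pattern, word, max_errors):
    """True if word contains pattern with at most max_errors errors."""
    return substring_errors(pattern, word) <= max_errors

print(approx_match("chrony", "crony", 1))
```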
327
328
329       ^substring=
330              Match any word containing substring as a substring, with the re‐
331              quirement that substring occurs at the beginning of the  matched
332              word.
333
334
335   Precedence matters
336       The binding order of the constructions is:
337
338
339       1.     Individual  command  line  arguments  define separate conditions
340              which are AND-ed together
341
342
343       2.     Within a single argument, the letters before  the  colon  define
344              which  message  parts the expression applies to.  If there is no
345              colon, the expression applies to all the headers listed  earlier
346              and the body.
347
348
349       3.     After the colon, slashes delineate separate disjuncts, which are
350              OR-ed together.
351
352
353       4.     Each disjunct may contain separate conjuncts,  which  are  sepa‐
354              rated by commas.  These conditions are AND-ed together.
355
356
       5.     Each conjunct may start with a tilde to negate it, and may be
              followed by an equals sign to indicate a substring match,
              optionally followed by an integer to define the maximum number
              of errors allowed.
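
       The binding order can be illustrated with a small parser for a single
       command line argument.  This Python sketch is an assumption-laden
       illustration, not mairix's parser: the function name and the
       dictionary keys are hypothetical, it uses the `=' substring marker
       described under "Match words", and it covers word patterns only (not
       the d:, z: or F: forms).

```python
def parse_argument(arg):
    """Split one search argument into (parts, disjuncts).

    parts is the string before the colon, or None if there is no colon.
    disjuncts is a list (OR) of lists (AND) of conjunct dictionaries.
    """
    parts, colon, expr = arg.partition(":")
    if not colon:                        # no colon: applies to all parts
        parts, expr = None, arg
    disjuncts = []
    for alternative in expr.split("/"):  # slashes separate OR-ed disjuncts
        conjuncts = []
        for term in alternative.split(","):  # commas separate AND-ed conjuncts
            negated = term.startswith("~")
            if negated:
                term = term[1:]
            anchored = term.startswith("^")
            if anchored:
                term = term[1:]
            word, equals, errors = term.partition("=")
            conjuncts.append({
                "word": word,
                "negated": negated,
                "anchored": anchored,
                "substring": bool(equals),
                "max_errors": int(errors) if errors else 0,
            })
        disjuncts.append(conjuncts)
    return parts, disjuncts

parts, disjuncts = parse_argument("tc:alice,~bob/carol")
print(parts, [[c["word"] for c in d] for d in disjuncts])
```

       Separate command line arguments would each be parsed this way and the
       results AND-ed together, as in rule 1.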
361
362
363   Date specification
364       This section describes  the  syntax  used  for  specifying  dates  when
365       searching using the `d:' option.
366
367       Dates  are  specified  as  a range.  The start and end of the range can
368       both be specified.  Alternatively, if  the  start  is  omitted,  it  is
369       treated  as  being the beginning of time.  If the end is omitted, it is
370       treated as the current time.
371
372       There are 4 basic formats:
373
374       d:start-end
375              Specify both start and end explicitly
376
377       d:start-
378              Specify start, end is the current time
379
380       d:-end Specify end, start is 'a long time ago' (i.e.  early  enough  to
381              include any message).
382
383       d:period
384              Specify  start  and  end implicitly, as the start and end of the
385              period given.
386
387
       The start and end can each be specified as either absolute or
       relative.  A relative endpoint is given as a number followed by a
       single letter defining the scaling:
391
392
       ┌────────┬─────────────┬───────────┬───────────────────────┐
       │letter  │  short for  │  example  │  meaning              │
       ├────────┼─────────────┼───────────┼───────────────────────┤
       │d       │  days       │  3d       │  3 days               │
       │w       │  weeks      │  2w       │  2 weeks (14 days)    │
       │m       │  months     │  5m       │  5 months (150 days)  │
       │y       │  years      │  4y       │  4 years (4*365 days) │
       └────────┴─────────────┴───────────┴───────────────────────┘
401
402       Months are always treated as 30 days, and years as 365 days,  for  this
403       purpose.
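
       The arithmetic for relative endpoints can be sketched in Python as
       follows.  The function name is hypothetical, and 18 May 2003 is
       assumed as the current date purely for illustration.

```python
from datetime import date, timedelta

# Fixed day counts per unit: months are 30 days and years are 365 days.
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30, "y": 365}

def relative_date(spec, today):
    """Return the date lying spec (for example '6w') before today."""
    count, unit = int(spec[:-1]), spec[-1]
    return today - timedelta(days=count * DAYS_PER_UNIT[unit])

today = date(2003, 5, 18)           # assumed current date
print(relative_date("6w", today))   # 2003-04-06
print(relative_date("2w", today))   # 2003-05-04
```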
404
       Absolute times can be specified in many forms.  Some forms mean one
       thing when they define a start date and another when they define an
       end date.  Where a single expression specifies both the start and end
       (i.e. where the argument to d: doesn't contain a `-'), it will usually
       have different interpretations in the two cases.
410
411       In  the  examples  below,  suppose the current date is Sunday May 18th,
412       2003 (when I started to write this material.)
413
414
       ┌─────────────────────┬──────────────────────┬───────────────────────┬─────────────────────────────────┐
       │Example              │  Start date          │  End date             │  Notes                          │
       ├─────────────────────┼──────────────────────┼───────────────────────┼─────────────────────────────────┤
       │d:20030301-20030425  │  March 1st, 2003     │  April 25th, 2003     │                                 │
       │d:030301-030425      │  March 1st, 2003     │  April 25th, 2003     │  century assumed                │
       │d:mar1-apr25         │  March 1st, 2003     │  April 25th, 2003     │                                 │
       │d:Mar1-Apr25         │  March 1st, 2003     │  April 25th, 2003     │  case insensitive               │
       │d:MAR1-APR25         │  March 1st, 2003     │  April 25th, 2003     │  case insensitive               │
       │d:1mar-25apr         │  March 1st, 2003     │  April 25th, 2003     │  date and month in either order │
       │d:2002               │  January 1st, 2002   │  December 31st, 2002  │  whole year                     │
       │d:mar                │  March 1st, 2003     │  March 31st, 2003     │  most recent March              │
       │d:oct                │  October 1st, 2002   │  October 31st, 2002   │  most recent October            │
       │d:21oct-mar          │  October 21st, 2002  │  March 31st, 2003     │  start before end               │
       │d:21apr-mar          │  April 21st, 2002    │  March 31st, 2003     │  start before end               │
       │d:21apr-             │  April 21st, 2003    │  May 18th, 2003       │  end omitted                    │
       │d:-21apr             │  January 1st, 1900   │  April 21st, 2003     │  start omitted                  │
       │d:6w-2w              │  April 6th, 2003     │  May 4th, 2003        │  both dates relative            │
       │d:21apr-1w           │  April 21st, 2003    │  May 11th, 2003       │  one date relative              │
       │d:21apr-2y           │  April 21st, 2001    │  May 18th, 2001       │  start before end               │
       │d:99-11              │  January 1st, 1999   │  May 11th, 2003       │ 2 digits are a day of the month │
       │                     │                      │                       │ if possible, otherwise a year   │
       │d:99oct-1oct         │  October 1st, 1999   │  October 1st, 2002    │ end before now, single digit is │
       │                     │                      │                       │ a day of the month              │
       │d:99oct-01oct        │  October 1st, 1999   │  October 31st, 2001   │ 2  digits  starting  with  zero │
       │                     │                      │                       │ treated as a year               │
       │d:oct99-oct1         │  October 1st, 1999   │  October 1st, 2002    │ day and month in either order   │
       │d:oct99-oct01        │  October 1st, 1999   │  October 31st, 2001   │ year and month in either order  │
       └─────────────────────┴──────────────────────┴───────────────────────┴─────────────────────────────────┘
443
444       The principles in the table work as follows.
445
446       •      When the expression defines a period of more than a day (i.e. if
447              a month or year is specified), the earliest day in the period is
448              taken  when  the  start date is defined, and the last day in the
449              period if the end of the range is being defined.
450
451       •      The end date is always taken to be  on  or  before  the  current
452              date.
453
454       •      The start date is always taken to be on or before the end date.
455
456

SETTING UP THE MATCH FOLDER

       If the match folder does not exist when running in search mode, it is
       automatically created.  For 'mformat=maildir' (the default), this
       should be all you need to do.  If you use 'mformat=mh', you may have to
       run some commands before your mailer will recognize the folder.  For
       example, for mutt, you could do
463
464              mkdir -p /home/richard/Mail/mfolder
465              touch /home/richard/Mail/mfolder/.mh_sequences
466
467       which  seems  to  work.   Alternatively,  within  mutt,  you  could set
468       MBOX_TYPE to 'mh' and save a message to '+mfolder' to have mutt set  up
469       the structure for you in advance.
470
471       If  you use Sylpheed, the best way seems to be to create the new folder
472       from within Sylpheed before letting mairix write into it.
473
474

EXAMPLES

476       Suppose my email address is <richard@doesnt.exist>.
477
478       Either of the following will match all messages  newer  than  3  months
479       from me with the word 'chrony' in the subject line:
480
              mairix d:3m- f:richard,doesnt,exist s:chrony
              mairix d:3m- f:richard@doesnt.exist s:chrony
483
484       Suppose  I  don't  mind a few spurious matches on the address, I want a
485       wider date range, and I suspect that some messages I replied  to  might
486       have  had  the  subject  keyword spelt wrongly (let's allow up to 2 er‐
487       rors):
488
489              mairix d:6m- f:richard s:chrony=2
490

NOTES

       mairix works exclusively in terms of words.  The index that's built in
       indexing mode contains a table of which words occur in which messages.
       Hence, the search capability is based on finding messages that contain
       particular words.  mairix defines a word as any string of alphanumeric
       characters plus the underscore.  Any whitespace, punctuation, hyphens
       etc. are treated as word boundaries.
498
       mairix has special handling for the To:, Cc: and From: headers.
       Besides the normal word scan, these headers are scanned a second time,
       where the characters '@', '-' and '.' are also treated as word
       characters.  This allows most (if not all) email addresses to appear
       in the database as single words.  So if you have a mail from
       wibble@foobar.zzz, it will match on both these searches:
505
506
507              mairix f:foobar
508              mairix f:wibble@foobar.zzz
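
       The two scans can be sketched in Python with regular expressions.
       This is an illustration only: the function name is hypothetical, and
       "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

# Normal scan: runs of alphanumeric characters and underscore.
WORD = re.compile(r"[A-Za-z0-9_]+")
# Second scan for To:, Cc: and From:, where '@', '-' and '.' also count.
ADDRESS_WORD = re.compile(r"[A-Za-z0-9_@.\-]+")

def index_words(text, address_header=False):
    """Return the set of words that would be recorded for text."""
    words = set(WORD.findall(text))
    if address_header:
        words |= set(ADDRESS_WORD.findall(text))
    return words

print(sorted(index_words("wibble@foobar.zzz", address_header=True)))
# ['foobar', 'wibble', 'wibble@foobar.zzz', 'zzz']
```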
509
510       It should be clear by now that the searching cannot  be  used  to  find
511       messages  matching  general  regular  expressions.  This has never been
512       much of a limitation.  Most searches are for particular  keywords  that
513       were  in the messages, or details of the recipients, or the approximate
514       date.
515
       It's also worth pointing out that there is no 'locality' information
       stored, so you can't search for messages that have one word 'close' to
       some other word.  For every message and every word, there is a simple
       yes/no condition stored - whether the message contains the word in a
       particular header or in the body.  So far this has proved to be
       adequate.  mairix has a similar feel to using an Internet search
       engine.
522
523

FILES

525       ~/.mairixrc
526
527

AUTHOR

529       Copyright (C) 2002-2006 Richard P. Curnow <rc@rc0.org.uk>
530

SEE ALSO

532       mairixrc(5)
533

BUGS

535       We need a plugin scheme to allow more types of attachment to be scanned
536       and indexed.
537
538
539
540
                                 January 2006                        MAIRIX(1)