MU-INDEX(1)                 General Commands Manual                MU-INDEX(1)

NAME

       mu index - index e-mail messages stored in Maildirs

SYNOPSIS

       mu index [options]

DESCRIPTION

       mu index is the mu command for scanning the contents of Maildir
       directories and storing the results in a Xapian database. The data can
       then be queried using mu-find(1).

       Note that before the first time you run mu index, you must run mu init
       to initialize the database.

       mu index understands Maildirs as defined by Daniel Bernstein for
       qmail(7). In addition, it understands recursive Maildirs (Maildirs
       within Maildirs) and Maildir++. It can also deal with VFAT-based
       Maildirs, which use '!' or ';' as the separator instead of ':'.
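
       As an illustration of these separators, the following sketch extracts
       the flags from a message file name, accepting ':', '!' or ';' before
       the "2," marker of the Maildir info suffix. The function name
       maildir_flags and the sample file names are invented for this example;
       it does not claim to mirror how mu parses file names internally.

```shell
#!/bin/sh
# maildir_flags NAME: print the flags part of a Maildir file name, i.e. the
# text after "<sep>2,", where <sep> is ':' (standard) or '!' / ';' (VFAT).
# Prints an empty line when the name carries no info suffix.
maildir_flags() {
    name=${1##*/}                        # drop leading directory components
    flags="${name##*[:;!]2,}"            # strip everything up to "<sep>2,"
    if [ "$flags" = "$name" ]; then      # pattern did not match: no suffix
        printf '\n'
    else
        printf '%s\n' "$flags"
    fi
}

# Hypothetical file names, for illustration only.
maildir_flags "cur/1273038051.M2P7.host:2,RS"    # prints "RS"
maildir_flags "cur/1273038051.M2P7.host!2,S"     # prints "S"
maildir_flags "new/1273038051.M2P7.host"         # prints an empty line
```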

       E-mail messages which are not stored in something resembling a maildir
       leaf-directory (cur and new) are ignored, as are the cache directories
       for notmuch and gnus, and any dot-directory.

       Starting with mu 1.5.x, symlinks are followed, and can be spread over
       multiple filesystems; note, however, that moving files around is much
       faster when multiple filesystems are not involved.

       If there is a file called .noindex in a directory, the contents of that
       directory and all of its subdirectories will be ignored. This can be
       useful to exclude certain directories from the indexing process, for
       example directories with spam-messages.

       If there is a file called .noupdate in a directory, the contents of
       that directory and all of its subdirectories will be ignored, unless we
       do a full rebuild (with mu init). This can be useful to speed things up
       if you have some maildirs that never change. Note that you can still
       search for these messages; this only affects updating the database.
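
       A minimal sketch of how the two marker files might be managed from a
       script. The helper names mark_maildir and list_marked are assumptions
       made for this example and are not part of mu.

```shell
#!/bin/sh
# mark_maildir DIR KIND: create the marker file DIR/.noindex or DIR/.noupdate.
# KIND is "noindex" (never index DIR) or "noupdate" (skip DIR on updates).
mark_maildir() {
    case $2 in
        noindex|noupdate)
            touch "$1/.$2" ;;
        *)
            echo "usage: mark_maildir DIR noindex|noupdate" >&2
            return 2 ;;
    esac
}

# list_marked ROOT: print every marker file found below ROOT, sorted.
list_marked() {
    find "$1" -type f \( -name .noindex -o -name .noupdate \) | sort
}
```

       For example, "mark_maildir ~/Maildir/spam noindex" followed by
       "list_marked ~/Maildir"; the directory names here are hypothetical.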

       There is also the --lazy-check option, which can greatly speed up
       indexing; see below for details.

       The first run of mu index may take a few minutes if you have a lot of
       mail (tens of thousands of messages). Fortunately, such a full scan
       needs to be done only once; after that it suffices to index the
       changes, which goes much faster. See the notes on performance (i, ii,
       iii) below for more information.

       The optional 'phase two' of the indexing-process is the removal of
       messages from the database for which there is no longer a corresponding
       file in the Maildir. If you do not want this, you can use -n,
       --nocleanup.

       When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM
       (e.g., when you press Ctrl-C during the indexing process), it tries to
       shut down gracefully: it tries to save and commit data, close the
       database, and so on. If it receives another signal (e.g., when you
       press Ctrl-C once more), mu index will terminate immediately.
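
       The two-stage behaviour described above could be used from a script
       roughly as follows: send one signal, give the process time to shut
       down cleanly, and send a second signal only if it is still running.
       This is a sketch; the function name stop_indexer and the default grace
       period of 10 seconds are assumptions made for this example.

```shell
#!/bin/sh
# stop_indexer PID [GRACE]: ask the process to shut down with SIGTERM, poll
# once per second for up to GRACE seconds (default 10), and send SIGTERM a
# second time if the process is still running after that.
stop_indexer() {
    pid=$1
    grace=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 0      # process already gone
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0     # process has exited
        sleep 1
        grace=$((grace - 1))
    done
    kill -TERM "$pid" 2>/dev/null                  # second signal
    return 0
}
```

       A typical use would be to start "mu index --quiet &" in the background
       and later call "stop_indexer $! 15".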

OPTIONS

       Note that some of the general options are described in the mu(1) man
       page and not here, as they apply to multiple mu commands.

       --lazy-check
              in lazy-check mode, mu does not consider messages for which the
              time-stamp (ctime) of the directory they reside in has not
              changed since the previous indexing run. This is much faster
              than the non-lazy check, but won't update messages that have
              changed (rather than having been added or removed), since merely
              editing a message does not update the directory time-stamp. Of
              course, you can run mu index occasionally without --lazy-check
              to pick up such messages.
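
              One way to combine the two modes is to run lazily most of the
              time and do a full check every so often. The sketch below only
              prints the command to run; the function name index_command, the
              run counter and the default interval of 24 runs are assumptions
              made for this example.

```shell
#!/bin/sh
# index_command RUN [FULL_EVERY]: print the command for the RUN-th scheduled
# indexing run. Every FULL_EVERY-th run (default 24) omits --lazy-check so
# that messages edited in place are picked up; all other runs are lazy.
index_command() {
    run=$1
    full_every=${2:-24}
    if [ $((run % full_every)) -eq 0 ]; then
        echo "mu index --quiet"
    else
        echo "mu index --quiet --lazy-check"
    fi
}

index_command 24    # prints "mu index --quiet"
index_command 25    # prints "mu index --quiet --lazy-check"
```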

       -n, --nocleanup
              disables the database cleanup that mu does by default after
              indexing.

   A note on performance (i)
       As a non-scientific benchmark, a simple test on the author's machine (a
       Thinkpad X61s laptop using Linux 2.6.35 and an ext3 file system) with
       no existing database, and a maildir with 27273 messages:

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        66,65s user 6,05s system 27% cpu 4:24,20 total
       (about 103 messages per second)
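
       The figure of about 103 messages per second follows from dividing the
       message count by the elapsed ("total") time. A small sketch of that
       arithmetic; the helper names to_seconds and msgs_per_sec are made up
       for this example, and the only time format handled is the one shown in
       the output above (decimal comma, optional leading minutes or hours).

```shell
#!/bin/sh
# to_seconds TIME: convert an elapsed time such as "4:24,20" or "0,905"
# (decimal comma, optional minutes/hours) to seconds with three decimals.
to_seconds() {
    printf '%s\n' "$1" | tr ',' '.' |
        LC_ALL=C awk -F: '{
            s = 0
            for (i = 1; i <= NF; i++) s = s * 60 + $i
            printf "%.3f\n", s
        }'
}

# msgs_per_sec COUNT TIME: messages per second, rounded to an integer.
msgs_per_sec() {
    LC_ALL=C awk -v n="$1" -v t="$(to_seconds "$2")" \
        'BEGIN { printf "%.0f\n", n / t }'
}

msgs_per_sec 27273 "4:24,20"    # prints 103
```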

       A second run, which is the more typical use case when there is a
       database already, goes much faster:

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        0,48s user 0,76s system 10% cpu 11,796 total
       (more than 56818 messages per second of user CPU time; about 2312 per
       second of elapsed time)

       Note that each test flushes the caches first; a more common use case
       might be to run mu index when new mail has arrived; the cache may stay
       quite 'warm' in that case:

        $ time mu index --quiet
        0,33s user 0,40s system 80% cpu 0,905 total
       which is more than 30000 messages per second.

   A note on performance (ii)
       As of June 2012, we did the same non-scientific benchmark, this time
       with an Intel i5-2500 CPU @ 3.30GHz, an ext4 file system and a maildir
       with 22589 messages. We start without an existing database.

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        27,79s user 2,17s system 48% cpu 1:01,47 total
       (about 813 messages per second of user CPU time; about 367 per second
       of elapsed time)

       A second run, which is the more typical use case when there is a
       database already, goes much faster:

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        0,13s user 0,30s system 19% cpu 2,162 total
       (more than 173000 messages per second of user CPU time; about 10448
       per second of elapsed time)

   A note on performance (iii)
       As of July 2016, we did the same non-scientific benchmark, again with
       the Intel i5-2500 CPU @ 3.30GHz and an ext4 file system. This time, the
       maildir contains 72525 messages.

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        40,34s user 2,56s system 64% cpu 1:06,17 total
       (about 1099 messages per second).

       As shown, mu has been getting faster with each release, even with
       relatively expensive new features such as text-normalization (for
       case-insensitive/accent-insensitive matching). The profiles are now
       dominated by operations in the Xapian database.

FILES

       mu stores logs of its operations and queries in <muhome>/mu.log (by
       default, this is ~/.cache/mu/mu.log). Upon startup, mu checks the size
       of this log file. If it exceeds 1 MB, the file is moved to
       ~/.cache/mu/mu.log.old, overwriting any existing file of that name, and
       mu starts with an empty log file. This scheme allows for continued use
       of mu without the need for any manual maintenance of log files.
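
       The rotation scheme can be pictured with the following sketch. It is
       an illustration in shell, not mu's actual implementation; the function
       name rotate_log is invented, and reading "1 MB" as 1048576 bytes is an
       assumption.

```shell
#!/bin/sh
# rotate_log FILE [MAX_BYTES]: if FILE is larger than MAX_BYTES (default
# 1048576, an assumed reading of "1 MB"), move it to FILE.old, overwriting
# any previous FILE.old, and start again with an empty FILE.
rotate_log() {
    log=$1
    max=${2:-1048576}
    [ -f "$log" ] || return 0               # nothing to rotate
    size=$(( $(wc -c < "$log") ))           # arithmetic strips any padding
    if [ "$size" -gt "$max" ]; then
        mv -f "$log" "$log.old"
        : > "$log"                          # create the new, empty log
    fi
}
```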

ENVIRONMENT

       mu index uses MAILDIR to find the user's Maildir if it has not been
       specified explicitly with --maildir=<maildir>. If MAILDIR is not set,
       mu index will try ~/Maildir.
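
       The lookup order described here can be sketched in shell as follows.
       The function name resolve_maildir is hypothetical, and unlike the
       wording above this sketch treats an empty MAILDIR like an unset one.

```shell
#!/bin/sh
# resolve_maildir [EXPLICIT]: print the Maildir to use, preferring an
# explicitly given path, then $MAILDIR, then ~/Maildir.
# Note: an empty MAILDIR is treated like an unset one in this sketch.
resolve_maildir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "${MAILDIR:-$HOME/Maildir}"
    fi
}
```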

RETURN VALUE

       mu index returns 0 upon successful completion; any number greater than
       0 signals an error.
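
       A script can act on this return value along the following lines. The
       wrapper name run_index is an assumption made for this example; it runs
       whatever command it is given and passes the exit status through.

```shell
#!/bin/sh
# run_index CMD...: run an indexing command (normally "mu index ...") and
# report a failure on stderr, returning the command's own exit status.
run_index() {
    if "$@"; then
        return 0
    else
        status=$?
        echo "indexing failed with exit status $status" >&2
        return "$status"
    fi
}
```

       For example: "run_index mu index --quiet || exit 1".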

BUGS

       Please report bugs if you find them: https://github.com/djcb/mu/issues

AUTHOR

       Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>

SEE ALSO

       maildir(5), mu(1), mu-init(1), mu-find(1), mu-cfind(1)

User Manuals                       May 2020                        MU-INDEX(1)