MU-INDEX(1) General Commands Manual MU-INDEX(1)

mu index - index e-mail messages stored in Maildirs

mu index [options]
11
12
mu index is the mu command for scanning the contents of Maildir
directories and storing the results in a Xapian database. The data can
then be queried using mu-find(1).

Note that before you run mu index for the first time, you must run
mu init to initialize the database.

mu index understands Maildirs as defined by Daniel Bernstein for
qmail(7). In addition, it understands recursive Maildirs (Maildirs
within Maildirs) and Maildir++. It can also deal with VFAT-based
Maildirs, which use '!' or ';' as the separator instead of ':'.
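As an illustration of the separator handling described above, the following is a minimal shell sketch, not mu's own code. It assumes the usual Maildir naming scheme, in which a file name ends in "<sep>2,<flags>"; maildir_flags is a hypothetical helper and the file names are made up.

```shell
# maildir_flags NAME: print the flag letters from a Maildir file name.
# The info part is "<sep>2,<flags>", where <sep> is ':' or, on
# VFAT-based Maildirs, '!' or ';'.
maildir_flags() {
    case $1 in
        *:2,*)   printf '%s\n' "${1##*:2,}" ;;
        *'!2,'*) printf '%s\n' "${1##*!2,}" ;;
        *';2,'*) printf '%s\n' "${1##*;2,}" ;;
        *)       printf '\n' ;;   # no info part, hence no flags
    esac
}

maildir_flags '1590000000.M1P2.host:2,RS'   # prints RS
maildir_flags '1590000000.M1P2.host!2,S'    # prints S
maildir_flags '1590000000.M1P2.host;2,F'    # prints F
```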
25
E-mail messages that are not stored in something resembling a maildir
leaf-directory (cur and new) are ignored, as are the cache directories
for notmuch and gnus, and any dot-directory.

Starting with mu 1.5.x, symlinks are followed, and can be spread over
multiple filesystems; note, however, that moving files around is much
faster when multiple filesystems are not involved.

If there is a file called .noindex in a directory, the contents of that
directory and all of its subdirectories will be ignored. This can be
useful to exclude certain directories from the indexing process, for
example directories with spam messages.

If there is a file called .noupdate in a directory, the contents of
that directory and all of its subdirectories will be ignored, unless we
do a full rebuild (with mu init). This can be useful to speed things up
if you have some maildirs that never change. Note that you can still
search for these messages; this only affects updating the database.
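The marker files are ordinary, empty files. The sketch below sets them up in a scratch directory; the folder names (inbox, spam, archive-2010) are hypothetical, and for real use the scratch root would be replaced by your own Maildir.

```shell
# Build a small scratch Maildir tree; substitute your own Maildir root.
root=$(mktemp -d)
mkdir -p "$root/inbox/cur" "$root/inbox/new" \
         "$root/spam/cur" "$root/archive-2010/cur"

# Never index anything below spam/.
touch "$root/spam/.noindex"

# Skip archive-2010/ during regular updates.
touch "$root/archive-2010/.noupdate"

# List the marker files that are now in place.
find "$root" \( -name .noindex -o -name .noupdate \) | sort
```

The next mu index run over such a tree should then skip both directories, as described above.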
44
There is also the --lazy-check option, which can greatly speed up
indexing; see below for details.

The first run of mu index may take a few minutes if you have a lot of
mail (tens of thousands of messages). Fortunately, such a full scan
needs to be done only once; after that it suffices to index the
changes, which goes much faster. See the notes on performance (i, ii,
iii) below for more information.

The optional 'phase two' of the indexing process is the removal of
messages from the database for which there is no longer a corresponding
file in the Maildir. If you do not want this, you can use -n,
--nocleanup.
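The idea behind this cleanup phase can be sketched in shell. This is only an illustration under stated assumptions, not how mu implements it: mu keeps its paths in the Xapian database, whereas the sketch reads them from a plain text file, and stale_paths and the stored-paths file are hypothetical.

```shell
# stale_paths: read stored message paths on stdin, one per line, and
# print those that no longer exist on disk (the cleanup candidates).
stale_paths() {
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Scratch Maildir with two messages and a list of their stored paths.
maildir=$(mktemp -d)
mkdir -p "$maildir/cur"
: > "$maildir/cur/msg1"
: > "$maildir/cur/msg2"
printf '%s\n' "$maildir/cur/msg1" "$maildir/cur/msg2" > "$maildir/stored-paths"

# One message disappears from the Maildir.
rm "$maildir/cur/msg2"

stale=$(stale_paths < "$maildir/stored-paths")
echo "would remove from the database: $stale"
```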
58
When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM
(e.g., when you press Ctrl-C during the indexing process), it tries to
shut down gracefully: it tries to save and commit data, close the
database, and so on. If it receives another signal (e.g., when you
press Ctrl-C once more), mu index terminates immediately.
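The two-stage behavior described above is a common pattern and can be sketched with a shell trap. This is an illustration only, not mu's implementation; on_signal, index_all and the exit status 130 are choices made for this sketch.

```shell
interrupted=0

# First signal: ask the work loop to stop. Second signal: exit at once.
on_signal() {
    if [ "$interrupted" -eq 1 ]; then
        exit 130
    fi
    interrupted=1
}
trap on_signal INT HUP TERM

# index_all ITEM...: pretend to index each item, checking the flag
# between items so that a pending stop request ends the loop cleanly.
index_all() {
    for item in "$@"; do
        if [ "$interrupted" -eq 1 ]; then
            break
        fi
        echo "indexing $item"
    done
    echo "committing and closing"   # still reached after a graceful stop
}

index_all msg1 msg2
```

Checking the flag only between items keeps each unit of work intact, which is what makes the first stop request graceful.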
64
65
Note that some of the general options are described in the mu(1) man
page and not here, as they apply to multiple mu commands.

--lazy-check
in lazy-check mode, mu does not consider messages for which the
time-stamp (ctime) of the directory they reside in has not changed
since the previous indexing run. This is much faster than the
non-lazy check, but won't update messages that have changed (rather
than having been added or removed), since merely editing a message
does not update the directory time-stamp. Of course, you can run
mu index occasionally without --lazy-check to pick up such messages.
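The directory time-stamp behavior that lazy-check relies on can be observed directly. The sketch below is not mu's code; it assumes GNU coreutils stat (where -c %Z prints the ctime in seconds since the epoch), and dir_changed_since is a hypothetical helper.

```shell
# dir_changed_since DIR EPOCH: succeed if DIR's ctime is later than EPOCH.
# Assumes GNU coreutils stat.
dir_changed_since() {
    dir_ctime=$(stat -c %Z "$1") || return 2
    [ "$dir_ctime" -gt "$2" ]
}

maildir=$(mktemp -d)
mkdir -p "$maildir/cur"
echo "Subject: first" > "$maildir/cur/msg1"
last_run=$(stat -c %Z "$maildir/cur")   # pretend an indexing run happened now

# Editing a message in place does not touch the directory itself,
# so a lazy check would skip this directory.
echo "X-Note: edited" >> "$maildir/cur/msg1"
if dir_changed_since "$maildir/cur" "$last_run"; then edited=rescan; else edited=skip; fi

# Adding a message updates the directory ctime. The sleep makes sure
# the change falls in a later whole second than last_run.
sleep 1
echo "Subject: second" > "$maildir/cur/msg2"
if dir_changed_since "$maildir/cur" "$last_run"; then added=rescan; else added=skip; fi

echo "after edit: $edited, after add: $added"
```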
80
81
--nocleanup
disables the database cleanup that mu does by default after indexing.
85
86
A note on performance (i)
As a non-scientific benchmark, a simple test on the author's machine (a
Thinkpad X61s laptop using Linux 2.6.35 and an ext3 file system) with
no existing database, and a maildir with 27273 messages:

$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
66,65s user 6,05s system 27% cpu 4:24,20 total
(about 103 messages per second of elapsed time)

A second run, which is the more typical use case when there is a
database already, goes much faster:

$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
0,48s user 0,76s system 10% cpu 11,796 total
(more than 56818 messages per second of user CPU time, or about 2300
per second of elapsed time)

Note that each test flushes the caches first; a more common use case
might be to run mu index when new mail has arrived; the cache may stay
quite 'warm' in that case:

$ time mu index --quiet
0,33s user 0,40s system 80% cpu 0,905 total
which is more than 30000 messages per second of elapsed time.
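The throughput figures above come from dividing the message count by one of the times in the time output. The sketch below redoes that arithmetic with awk; to_seconds and rate are hypothetical helpers, and the decimal commas are assumed to come from the locale of the shell that produced the timings.

```shell
# to_seconds TIME: convert "M:SS,cc" or "SS,cc" (decimal comma) to seconds.
to_seconds() {
    echo "$1" | awk -F: '{
        gsub(",", ".")                    # decimal comma -> decimal point
        if (NF == 2) print $1 * 60 + $2   # minutes:seconds
        else print $1                     # seconds only
    }'
}

# rate MESSAGES SECONDS: whole messages per second, rounded down.
rate() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%d\n", n / s }'
}

rate 27273 "$(to_seconds 4:24,20)"   # first run, elapsed time: 103
rate 27273 "$(to_seconds 11,796)"    # second run, elapsed time: 2312
rate 27273 "$(to_seconds 0,48)"      # second run, user CPU time: 56818
```

Which time to divide by makes a large difference: user CPU time ignores the time spent waiting for the disk, so the elapsed ("total") time is usually the more meaningful basis for a messages-per-second figure.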
112
113
114
A note on performance (ii)
As of June 2012, we did the same non-scientific benchmark, this time
with an Intel i5-2500 CPU @ 3.30GHz, an ext4 file system and a maildir
with 22589 messages. We start without an existing database.

$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
27,79s user 2,17s system 48% cpu 1:01,47 total
(about 813 messages per second of user CPU time, or about 367 per
second of elapsed time)

A second run, which is the more typical use case when there is a
database already, goes much faster:

$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
0,13s user 0,30s system 19% cpu 2,162 total
(more than 173000 messages per second of user CPU time, or about 10400
per second of elapsed time)
132
133
134
A note on performance (iii)
As of July 2016, we did the same non-scientific benchmark, again with
the Intel i5-2500 CPU @ 3.30GHz and an ext4 file system. This time, the
maildir contains 72525 messages.

$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
40,34s user 2,56s system 64% cpu 1:06,17 total
(about 1099 messages per second of elapsed time)

As shown, mu has been getting faster with each release, even with
relatively expensive new features such as text-normalization (for
case-insensitive/accent-insensitive matching). The profiles are now
dominated by operations in the Xapian database.
149
150
mu stores logs of its operations and queries in <muhome>/mu.log (by
default, this is ~/.cache/mu/mu.log). Upon startup, mu checks the size
of this log file. If it exceeds 1 MB, it is moved to
~/.cache/mu/mu.log.old, overwriting any existing file of that name, and
mu starts with an empty log file. This scheme allows for continued use
of mu without the need for any manual maintenance of log files.
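The rotation scheme described above can be sketched in shell. This is an illustration rather than mu's code: rotate_log is a hypothetical helper, and reading "1 MB" as 1048576 bytes is an assumption.

```shell
# rotate_log FILE [MAX_BYTES]: if FILE is larger than MAX_BYTES, move it
# to FILE.old (overwriting any previous FILE.old) and start FILE empty.
rotate_log() {
    log=$1
    max=${2:-1048576}                   # assumed meaning of "1 MB"
    [ -f "$log" ] || return 0
    size=$(($(wc -c < "$log")))         # arithmetic strips any padding
    if [ "$size" -gt "$max" ]; then
        mv -f "$log" "$log.old"
        : > "$log"
    fi
}

# Demonstration on a scratch file with a tiny limit of 5 bytes.
logdir=$(mktemp -d)
printf '0123456789' > "$logdir/mu.log"
rotate_log "$logdir/mu.log" 5
ls "$logdir"
```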
158
159
mu index uses MAILDIR to find the user's Maildir if it has not been
specified explicitly with --maildir=<maildir>. If MAILDIR is not set,
mu index will try ~/Maildir.
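The lookup order described above can be sketched in shell. resolve_maildir is a hypothetical helper, not part of mu, and treating an empty MAILDIR like an unset one is an assumption of this sketch.

```shell
# resolve_maildir [EXPLICIT]: print the Maildir that would be used.
# Order: explicit argument, then $MAILDIR, then ~/Maildir.
resolve_maildir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${MAILDIR:-}" ]; then
        printf '%s\n' "$MAILDIR"
    else
        printf '%s\n' "$HOME/Maildir"
    fi
}

resolve_maildir                 # $MAILDIR if set, otherwise ~/Maildir
resolve_maildir /srv/mail/me    # an explicit path always wins
```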
164
165
mu index returns 0 upon successful completion; any number greater than
0 signals an error.
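A script that runs the indexer, for example from cron or a mail-fetching hook, can act on this return value. The wrapper below is a minimal sketch; run_and_report is a hypothetical helper, and the stand-in command true is used so the example runs without mu installed.

```shell
# run_and_report CMD [ARG...]: run a command and report on its exit status.
run_and_report() {
    if "$@"; then
        echo "ok"
    else
        status=$?
        echo "failed with status $status" >&2
        return "$status"
    fi
}

# With mu installed, this might be called as:
#   run_and_report mu index --quiet
run_and_report true
```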
169
170
Please report bugs if you find them: https://github.com/djcb/mu/issues

Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>

maildir(5), mu(1), mu-init(1), mu-find(1), mu-cfind(1)

User Manuals May 2020 MU-INDEX(1)