MU INDEX(1)              General Commands Manual              MU INDEX(1)

mu index -- index e-mail messages stored in Maildirs

mu [common-options] index

mu index is the mu command for scanning the contents of Maildir
directories and storing the results in a Xapian database. The data can
then be queried using mu-find(1).

Before you run mu index for the first time, you must run mu init to
initialize the database.

mu index understands Maildirs as defined by Daniel Bernstein for
qmail(7). In addition, it understands recursive Maildirs (Maildirs
within Maildirs) and Maildir++. It also supports VFAT-based Maildirs,
which use '!' or ';' as the separator instead of ':'.
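
To illustrate the separator variants, here is a minimal sketch (not mu's
own parser) that extracts the flag letters from a Maildir file name. The
function name maildir_flags and the sample file names are made up for
illustration; the sketch assumes the usual Maildir naming scheme, in
which the flags follow the separator and the string "2,".

```shell
# Hypothetical helper, not part of mu: print the flags of a Maildir file
# name, accepting ':' as well as the VFAT-friendly '!' and ';' separators.
maildir_flags() {
    # Keep only the letters after "<separator>2," at the end of the name;
    # print nothing when the name carries no flags section.
    printf '%s\n' "$1" | sed -n 's/.*[!;:]2,\([A-Za-z]*\)$/\1/p'
}

maildir_flags '1700000000.M1P2.example:2,RS'   # standard separator
maildir_flags '1700000000.M1P2.example!2,S'    # VFAT-style separator
maildir_flags '1700000000.M1P2.example;2,FS'   # VFAT-style separator
```

All three names are handled the same way; only the separator character
differs.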

E-mail messages which are not stored in something resembling a maildir
leaf-directory (cur and new) are ignored, as are the cache directories
for notmuch and gnus, and any dot-directory.

Starting with mu 1.5.x, symlinks are followed, and can be spread over
multiple filesystems; note, however, that moving files around is much
faster when multiple filesystems are not involved.

If there is a file called .noindex in a directory, the contents of that
directory and all of its subdirectories will be ignored. This can be
useful to exclude certain directories from the indexing process, for
example directories with spam messages.

If there is a file called .noupdate in a directory, the contents of
that directory and all of its subdirectories will be ignored, unless we
do a full rebuild (with mu init). This can be useful to speed things up
if you have some maildirs that never change. Note that you can still
search for these messages; this only affects updating the database.
.noupdate is ignored when you start indexing with an empty database
(such as directly after mu init).
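
As a small sketch of how the two marker files are placed, the following
builds a throwaway directory tree and marks one folder with .noindex and
another with .noupdate. The folder names spam and archive-2010 and the
variable maildir_root are assumptions for the example; only the marker
file names come from the text above.

```shell
# Throwaway demo tree; in practice use your real Maildir root instead.
maildir_root=$(mktemp -d)

# Two hypothetical folders, each with the usual Maildir leaf directories.
for folder in spam archive-2010; do
    mkdir -p "$maildir_root/$folder/cur" \
             "$maildir_root/$folder/new" \
             "$maildir_root/$folder/tmp"
done

# Never index anything below spam/.
touch "$maildir_root/spam/.noindex"

# Skip archive-2010/ on incremental runs; it is still indexed when
# starting from an empty database.
touch "$maildir_root/archive-2010/.noupdate"

# List the marker files that were created.
find "$maildir_root" -name '.no*' | sort
```

The marker files can be empty; according to the description above, it is
their presence in the directory that matters.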

There is also the --lazy-check option, which can greatly speed up
indexing; see below for details.

The first run of mu index may take a few minutes if you have a lot of
mail (tens of thousands of messages). Fortunately, such a full scan
needs to be done only once; after that it suffices to index the
changes, which goes much faster. See the performance notes below for
more information.

The optional 'phase two' of the indexing process is the removal of
messages from the database for which there is no longer a corresponding
file in the Maildir. If you do not want this, you can use -n,
--nocleanup.

When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM
(e.g., when you press Ctrl-C during the indexing process), it attempts
to shut down gracefully: it tries to save and commit data, close the
database, and so on. If it receives another signal (e.g., when you press
Ctrl-C once more), mu index terminates immediately.

--lazy-check
in lazy-check mode, mu does not consider messages for which the
time-stamp (ctime) of the directory they reside in has not changed since
the previous indexing run. This is much faster than the non-lazy check,
but won't update messages that have changed (rather than having been
added or removed), since merely editing a message does not update the
directory time-stamp. Of course, you can run mu index occasionally
without --lazy-check to pick up such messages.
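
One possible way to follow this advice is to use the lazy check on most
days and a full check once a week. The sketch below only picks the
options; the helper name mu_index_args and the choice of Sunday for the
full run are assumptions, and the options it prints are the ones
documented on this page.

```shell
# Hypothetical helper: choose mu index options for a given ISO weekday
# (1 = Monday ... 7 = Sunday). The weekly full run is an arbitrary choice.
mu_index_args() {
    if [ "$1" -eq 7 ]; then
        # Full check: also picks up messages that were edited in place.
        printf '%s\n' '--quiet'
    else
        # Lazy check: skip directories whose time-stamp is unchanged.
        printf '%s\n' '--quiet --lazy-check'
    fi
}

# Show what would be run today; drop the "echo" to actually run it.
# $(...) is left unquoted on purpose so the options split into words.
echo mu index $(mu_index_args "$(date +%u)")
```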

--nocleanup
disable the database cleanup that mu does by default after indexing.

--muhome
use a non-default directory to store and read the database, write the
logs, etc. By default, mu uses the XDG Base Directory Specification
(e.g. on GNU/Linux this defaults to ~/.cache/mu and ~/.config/mu).
Earlier versions of mu defaulted to ~/.mu, which now requires
--muhome=~/.mu.

The environment variable MUHOME can be used as an alternative to
--muhome. If both are set, --muhome takes precedence.

-d, --debug
makes mu generate extra debug information, useful for debugging the
program itself. By default, debug information goes to the log file,
~/.cache/mu/mu.log. It can safely be deleted when mu is not running.
When running with the --debug option, the log file can grow rather
quickly. See the note on logging below.

-q, --quiet
causes mu not to output informational messages and progress information
to standard output, but only to the log file. Error messages will still
be sent to standard error. Note that mu index is much faster with
--quiet, so it is recommended that you use this option when using mu
from scripts etc.

--log-stderr
causes mu to output log messages to standard error, in addition to
sending them to the log file.

--nocolor
do not use ANSI colors. The environment variable NO_COLOR can be used
as an alternative to --nocolor.

-V, --version
prints mu version and copyright information.

-h, --help
lists the various command line options.

indexing in ancient times (2009?)
As a non-scientific benchmark, a simple test on the author's machine (a
Thinkpad X61s laptop using Linux 2.6.35 and an ext3 file system) with
no existing database, and a maildir with 27273 messages:

    $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
    $ time mu index --quiet
    66,65s user 6,05s system 27% cpu 4:24,20 total

(about 103 messages per second)
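
The figure above follows from dividing the number of messages by the
elapsed wall-clock time. A minimal sketch of that arithmetic, with a
made-up helper name msgs_per_sec:

```shell
# Hypothetical helper: messages per second, rounded to a whole number.
# $1 = number of messages, $2 = elapsed wall-clock time in seconds.
msgs_per_sec() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.0f\n", n / t }'
}

# 4:24,20 total is 4 * 60 + 24.20 = 264.20 seconds.
msgs_per_sec 27273 264.20    # prints 103
```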

A second run, which is the more typical use case when there is a
database already, goes much faster:

    $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
    $ time mu index --quiet
    0,48s user 0,76s system 10% cpu 11,796 total

(more than 56818 messages per second)

Note that each test flushes the caches first; a more common use case
might be to run mu index when new mail has arrived; the cache may stay
quite 'warm' in that case:

    $ time mu index --quiet
    0,33s user 0,40s system 80% cpu 0,905 total

which is more than 30000 messages per second.

indexing in 2012
As of June 2012, we did the same non-scientific benchmark, this time
with an Intel i5-2500 CPU @ 3.30GHz, an ext4 file system and a maildir
with 22589 messages. We start without an existing database.

    $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
    $ time mu index --quiet
    27,79s user 2,17s system 48% cpu 1:01,47 total

(about 813 messages per second)

A second run, which is the more typical use case when there is a
database already, goes much faster:

    $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
    $ time mu index --quiet
    0,13s user 0,30s system 19% cpu 2,162 total

(more than 173000 messages per second)

indexing in 2016
As of July 2016, we did the same non-scientific benchmark, again with
the Intel i5-2500 CPU @ 3.30GHz and an ext4 file system. This time, the
maildir contains 72525 messages.

    $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
    $ time mu index --quiet
    40,34s user 2,56s system 64% cpu 1:06,17 total

(about 1099 messages per second).

indexing in 2022
A few years later and it is June 2022. There's a lot more happening
during indexing, but indexing became multi-threaded and machines are
faster; e.g. this is with an AMD Ryzen Threadripper 1950X (16 cores) @
3.399GHz.

The instructions are a little different since we have a proper
repeatable benchmark now. After building,

    $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
    $ THREAD_NUM=4 build/lib/tests/bench-indexer -m perf
    # random seed: R02Sf5c50e4851ec51adaf301e0e054bd52b
    1..1
    # Start of bench tests
    # Start of indexer tests
    indexed 5000 messages in 20 maildirs in 3763ms; 752 μs/message; 1328 messages/s (4 thread(s))
    ok 1 /bench/indexer/4-cores
    # End of indexer tests
    # End of bench tests

Things are again a little faster, even though the indexer does a lot
more now (text normalization, and pre-generating message-sexps). A
faster machine helps, too!

This command returns 0 upon successful completion, or a non-zero exit
code otherwise. Typical values are 2 (no matches found), 11 (database
schema mismatch) and 19 (failed to acquire database lock).

no matches found (2)
Nothing matching found; try a different query

database schema mismatch (11)
You need to re-initialize mu, see mu-init(1)

failed to acquire lock (19)
Some other program has exclusive access to the mu (Xapian) database
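
A script that runs mu index can branch on these values. The sketch
below maps an exit status to a short hint, using the codes from the
list above; the helper name explain_mu_exit and the wording of the hints
are assumptions, not output produced by mu.

```shell
# Hypothetical helper: map an exit status to a hint, using the codes
# from the list above.
explain_mu_exit() {
    case "$1" in
        0)  echo "ok" ;;
        2)  echo "no matches found" ;;
        11) echo "database schema mismatch; re-initialize with mu init" ;;
        19) echo "failed to acquire lock; another program is using the database" ;;
        *)  echo "unexpected error (exit code $1)" ;;
    esac
}

# Typical use in a script (commented out so the sketch has no side effects):
#   mu index --quiet; explain_mu_exit "$?"
explain_mu_exit 11
```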

Please report bugs at https://github.com/djcb/mu/issues.

Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>

This manpage is part of mu 1.10.5.

Copyright © 2022-2023 Dirk-Jan C. Binnema. License GPLv3+: GNU GPL
version 3 or later https://gnu.org/licenses/gpl.html. This is free
software: you are free to change and redistribute it. There is NO
WARRANTY, to the extent permitted by law.

maildir(5), mu(1), mu-init(1), mu-find(1), mu-cfind(1)