SQUATTER(8)                       Cyrus IMAP                       SQUATTER(8)

NAME

       squatter - Cyrus IMAP documentation

       Create SQUAT and Xapian indexes for mailboxes

SYNOPSIS

          general:
          squatter [ -C config-file ] [mode] [options] [source]

          i.e.:
          squatter [ -C config-file ] [ -v ] [ -a ] [ -S seconds ] [ -Z ]
          squatter [ -C config-file ] [ -v ] [ -a ] [ -i ] [ -N name ] [ -S seconds ] [ -r ] [ -Z ] mailbox...
          squatter [ -C config-file ] [ -v ] [ -a ] [ -i ] [ -N name ] [ -S seconds ] [ -r ] [ -Z ] -u user...
          squatter [ -C config-file ] [ -v ] [ -a ] -R [ -n channel ] [ -d ] [ -S seconds ] [ -Z ]
          squatter [ -C config-file ] [ -v ] [ -a ] -f synclogfile [ -S seconds ] [ -Z ]
          squatter [ -C config-file ] [ -v ] -t srctier(s)... -z desttier [ -B ] [ -F ] [ -U ] [ -T reindextiers ] [ -X ] [ -o ] [ -S seconds ] [ -u user... ]

DESCRIPTION

       NOTE:
          The name "squatter" once referred both to the SQUAT indexing engine
          and to the command used to create indexes.  Now that Cyrus supports
          more than one index type (SQUAT and Xapian, as of this writing),
          the name "squatter" refers to the command used to control index
          creation.  The terms "SQUAT" and "SQUAT index(es)" refer to the
          indexes used by the older SQUAT indexing engine.  From version 3.0
          onward, the search_engine setting in imapd.conf(5) determines which
          search engine is used.

       squatter creates a new text index for one or more IMAP mailboxes.  The
       index is a unified index of all of the header and body text of each
       message in a given mailbox.  This index is used to significantly reduce
       IMAP SEARCH times on a mailbox.

       mode is one of indexer, search, rolling, synclog, compact or audit.

       By default, squatter creates an index of ALL messages in the mailbox,
       not just those added since the last time it was run.  The -i option
       selects incremental updates.  Messages appended to the mailbox after
       squatter has run will NOT be included in the index.  To include new
       messages, squatter must be run again: either on a regular basis (via
       crontab or an entry in the EVENTS section of cyrus.conf(5)) or
       continuously in rolling mode (-R).
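
       As a sketch of the crontab approach mentioned above, the entry below
       runs an incremental update every night.  It assumes the crontab
       belongs to the Cyrus service user and reuses the squatter path from
       the EVENTS sample in EXAMPLES; the schedule is illustrative only.

```text
# m  h  dom mon dow  command
30   2  *   *   *    /usr/lib/cyrus/bin/squatter -i
```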

       In the first synopsis, squatter indexes all mailboxes.

       In the second synopsis, squatter indexes the specified mailbox(es).
       The mailboxes are space-separated.

       In the third synopsis, squatter indexes the mailbox(es) of the
       specified user(s).

       For the latter two index modes (mailbox, user) one may optionally
       specify -r to recurse from the specified start, or -a to limit action
       only to mailboxes which have the shared /vendor/cmu/cyrus-imapd/squat
       annotation set to "true".
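
       The first three synopses might be invoked as in the sketch below.
       The commands are printed rather than executed so they can be reviewed
       first.  The mailbox names and username are hypothetical (the mailbox
       spelling depends on the site's hierarchy separator), and the squatter
       path is the one used in the EVENTS sample in EXAMPLES.

```shell
#!/bin/sh
# Path reused from the EVENTS sample in EXAMPLES; adjust for your install.
SQUATTER=/usr/lib/cyrus/bin/squatter

# Print a command line instead of running it, so it can be reviewed first.
# Replace the body with "$@" to execute for real.
show() { printf '%s\n' "$*"; }

# First synopsis: full index of every mailbox.
show "$SQUATTER" -v

# Second synopsis: incremental update of two mailboxes and their
# sub-mailboxes (user.alice and user.bob are hypothetical names).
show "$SQUATTER" -v -i -r user.alice user.bob

# Third synopsis: the same kind of run, selected by (hypothetical) username.
show "$SQUATTER" -v -i -r -u alice@example.com

# Limit a run to mailboxes whose squat annotation is "true".
show "$SQUATTER" -v -a -r user.alice
```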

       In the fourth synopsis, squatter runs in rolling mode.  In this mode
       squatter backgrounds itself and runs as a daemon (unless -d is set),
       listening to a sync log channel chosen using the -n option, and set up
       using the sync_log_channels setting in imapd.conf(5).  Very soon after
       messages are delivered or uploaded to mailboxes, squatter will
       incrementally index the affected mailbox (see the notes below).

       In the fifth synopsis, squatter reads a single sync log file and
       performs incremental indexing on the mailbox(es) listed therein.  This
       is sometimes useful for cleaning up after problems with rolling mode.

       In the sixth synopsis, squatter will compact indices from srctier(s) to
       desttier, optionally reindexing (-X) or filtering expunged records (-F)
       in the process.  The optional -T flag may be used to specify members of
       srctiers which must be reindexed.  These files are eventually copied
       with rsync -a and then removed by rm.  rsync can increase the load
       average of the system, especially when the temporary directory is on
       tmpfs.  To throttle rsync it is possible to modify the call in
       imap/search_xapian.c and pass --bwlimit=<number> as a further
       parameter.  The -o flag may be used to direct that a single index be
       copied, rather than compacted, from srctier to desttier.  The -u flag
       may be used to restrict operation to the specified user(s).
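
       A minimal sketch of assembling a compact-mode command line follows.
       Tier names are site-specific; "temp", "data" and "archive" are
       hypothetical, as is the helper compact_cmd, which only prints the
       command so that it can be checked before being run.

```shell
#!/bin/sh
# Build (but do not run) a compact-mode command line.
# usage: compact_cmd SRCTIERS DESTTIER [extra flags...]
compact_cmd() {
    src=$1
    dest=$2
    shift 2
    # -t takes the comma-separated source tiers, -z the destination tier.
    cmd="squatter -t $src -z $dest"
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    printf '%s\n' "$cmd"
}

# Compact the (hypothetical) temp tier into data.
compact_cmd temp data

# Compact two tiers into archive, dropping expunged records (-F).
compact_cmd temp,data archive -F

# Reindex (-X) while compacting, for one (hypothetical) user only.
compact_cmd temp data -X -u alice@example.com
```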

       For all modes, the -S option may be specified, causing squatter to
       pause for the given number of seconds after each mailbox, to smooth
       loads.

       When using the Xapian engine, the -Z option may be specified for the
       indexing modes.  This tells squatter to consult the GUIDs indexed
       internally by Xapian, rather than relying on what is stored in
       cyrus.indexed.db, allowing recovery from a broken cyrus.indexed.db at
       the cost of efficiency.

       NOTE:
          Incremental updates are very inefficient with the SQUAT search
          engine.  If using SQUAT for large and active mailboxes, you should
          run squatter periodically as an EVENT in cyrus.conf(5).

       NOTE:
          Messages and mailboxes that have not been indexed CAN still be
          SEARCHed, just not as quickly as those with an index.

       squatter reads its configuration options out of the imapd.conf(5) file
       unless specified otherwise by -C.

OPTIONS

       -C config-file
              Use the specified configuration file config-file rather than the
              default imapd.conf(5).

       -a, --squat-annot
              Only create indexes for mailboxes which have the shared
              /vendor/cmu/cyrus-imapd/squat annotation set to "true".

              The value of the /vendor/cmu/cyrus-imapd/squat annotation is
              inherited by all children of the given mailbox, so an entire
              mailbox tree can be indexed (or not indexed) by setting a single
              annotation on the root of that tree with a value of "true" (or
              "false").  If a mailbox does not have a
              /vendor/cmu/cyrus-imapd/squat annotation set on it (or does not
              inherit one), then the mailbox is not indexed.  In other words,
              the implicit value of /vendor/cmu/cyrus-imapd/squat is "false".

       -A, --audit
              Audit the specified mailboxes (or all mailboxes) and report any
              unindexed messages.  This feature is only available on the
              master branch.

       -d, --nodaemon
              In rolling mode, don't background and do emit log messages on
              standard error.  Useful for debugging.  This feature was
              introduced in version 3.0.

       -B, --skip-locked
              In compact mode, use a non-blocking lock to start, and skip any
              users who have their xapianactive file locked at the time (i.e.
              by another reindex task).  This feature is only available on the
              master branch.

       -F, --filter
              In compact mode, filter the resulting database to only include
              messages which are not expunged in mailboxes with existing
              name/uidvalidity.  This feature was introduced in version 3.0.

       -f synclogfile, --synclog=synclogfile
              Read the synclogfile and incrementally index all the mailboxes
              listed therein, then exit.  This feature was introduced in
              version 3.0.

       -h, --help
              Display this usage information.

       -i, --incremental
              Perform incremental updates where indexes already exist.

       -N name, --name=name
              Only index mailboxes beginning with name while iterating through
              the mailbox list derived from other options.

       -n channel, --channel=channel
              In rolling mode, specify the name of the sync log channel that
              squatter will listen to.  The default is "squatter".  This
              channel must be defined in imapd.conf(5) before being used.
              This feature was introduced in version 3.0.

       -o, --copydb
              In compact mode, if only one source database is selected, just
              copy it to the destination rather than compacting.  This feature
              was introduced in version 3.0.

       -p, --allow-partials
              When indexing, allow messages to be partially indexed.  This may
              occur if attachment indexing is enabled but indexing failed for
              one or more attachment body parts.  If this flag is set, the
              message is partially indexed and squatter continues.  Otherwise
              squatter aborts with an error.  Also see -P.  Xapian only.  This
              feature is only available on the master branch.

       -P, --reindex-partials
              When reindexing, attempt to reindex any partially indexed
              messages (see -p).  Setting this flag implies -Z.  Xapian only.
              This feature is only available on the master branch.

       -L, --reindex-minlevel=level
              When reindexing, index all messages that have an index level
              less than level.  Currently, Cyrus only supports two index
              levels: a message for which attachment indexing was never
              attempted has index level 1, and a message that has indexed
              attachments, or does not contain attachments, has index level 3.
              Consequently, running squatter with level set to 3 will cause it
              to attempt reindexing all messages for which attachment indexing
              was never attempted.  Future Cyrus versions may introduce
              additional levels.  Setting this flag implies -Z.  Xapian only.
              This feature is only available on the master branch.
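
              The selection rule described for -L can be illustrated with a
              small sketch.  The function needs_reindex is hypothetical and
              only models the comparison stated above (index level less than
              level); it does not inspect any real index.

```shell
#!/bin/sh
# Return success (0) if a message at index level $1 would be picked up
# for reindexing with --reindex-minlevel=$2, i.e. when level < minlevel.
needs_reindex() {
    [ "$1" -lt "$2" ]
}

# The two index levels described above, checked against a minlevel of 3.
for level in 1 3; do
    if needs_reindex "$level" 3; then
        echo "level $level: reindex"
    else
        echo "level $level: skip"
    fi
done
```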

       -R, --rolling
              Run in rolling mode; squatter runs as a daemon listening to a
              sync log channel and continuously incrementally indexing
              mailboxes.  See also -d and -n.  This feature was introduced in
              version 3.0.

       -r, --recursive
              Recursively create indexes for all sub-mailboxes of the user,
              mailboxes or mailbox prefixes given as arguments.

       -s delta, --squat-skip=delta
              Skip mailboxes that have not been modified since they were last
              indexed.  This is achieved by comparing the last modification
              time of a mailbox to the last time the squat index of this
              mailbox was updated.  If the mailbox modification time plus
              delta is less than the squat index modification time, then the
              mailbox is skipped.  The argument value delta is defined in
              seconds and must be greater than or equal to zero.  The
              historical default delta was 60, and this remains a good general
              choice, but for technical reasons it must now be specified
              explicitly.  Squat only.
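
              The skip test described for -s amounts to a single comparison,
              sketched below with made-up timestamps in epoch seconds.  The
              function would_skip is hypothetical and models only the rule
              stated above; squatter itself obtains the times from the
              mailbox and its index.

```shell
#!/bin/sh
# Return success (0) if the mailbox would be skipped under -s DELTA:
#   mailbox mtime + delta < squat index mtime
# Arguments: mailbox mtime, index mtime (epoch seconds), delta (seconds).
would_skip() {
    mailbox_mtime=$1
    index_mtime=$2
    delta=$3
    [ $((mailbox_mtime + delta)) -lt "$index_mtime" ]
}

# Mailbox last changed at t=1000, index last written at t=1100.
would_skip 1000 1100 60 && echo "delta=60: skipped"      # 1060 < 1100
would_skip 1000 1100 120 || echo "delta=120: indexed"    # 1120 >= 1100
```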

       -S seconds, --sleep=seconds
              After processing each mailbox, sleep for "seconds" before
              continuing.  Can be used to provide some load balancing.
              Accepts fractional amounts.  This feature was introduced in
              version 3.0.

       -T reindextiers, --reindex-tier=reindextiers
              In compact mode, a comma-separated subset of the source tiers
              (see -t) to be reindexed.  Similar to -X but allows limiting the
              tiers that will be reindexed.  This feature was introduced in
              version 3.0.

       -t srctiers, --srctier=srctiers
              In compact mode, the comma-separated source tier(s) for the
              compacted indices.  At least one source tier must be specified
              in compact mode.  Xapian only.  This feature was introduced in
              version 3.0.

       -u name, --user=name
              The remaining arguments are usernames (e.g. foo@bar.com) rather
              than mailbox names.  Usernames are space-separated.  This
              feature was introduced in version 3.0.

       -U, --only-upgrade
              In compact mode, only compact if re-indexing.  Xapian only.
              This feature is only available on the master branch.

       -v, --verbose
              Increase the verbosity of progress/status messages.  Sometimes
              additional messages are emitted on the terminal with this option
              and the messages are unconditionally sent to syslog.  Sometimes
              messages are sent to syslog only if -v is provided.  In rolling
              and synclog modes, -vv sends even more messages to syslog.

       -X, --reindex
              Reindex all the messages before compacting.  This mode reads all
              the lists of messages indexed by the listed tiers, and
              re-indexes them into a temporary database before compacting that
              into place.  Xapian only.  This feature was introduced in
              version 3.0.

       -z desttier, --compact=desttier
              In compact mode, the destination tier for the compacted indices.
              This must be specified in compact mode.  Xapian only.  This
              feature was introduced in version 3.0.

       -Z, --internalindex
              When indexing messages, use the Xapian internal cyrusid rather
              than referencing the ranges of already indexed messages to know
              if a particular message is indexed.  Useful if the ranges get
              out of sync with the actual messages (e.g. if files on a tier
              are lost).  Xapian only.  This feature is only available on the
              master branch.

EXAMPLES

       squatter is typically deployed via entries in cyrus.conf(5), in either
       the DAEMON or EVENTS sections.

       For the older SQUAT search engine, which offers poor performance in
       rolling mode (-R), we recommend triggering periodic runs via entries
       in the EVENTS section.

       Sample entries from the EVENTS section of cyrus.conf(5) for periodic
       squatter runs:

              EVENTS {
                  # reindex changed mailboxes (fulltext) approximately every three hours
                  squatter1   cmd="/usr/bin/ionice -c idle /usr/lib/cyrus/bin/squatter -i" period=180

                  # reindex all mailboxes (fulltext) daily
                  squattera   cmd="/usr/lib/cyrus/bin/squatter" at=0117
              }

       For the newer Xapian search engine, and with sufficiently fast storage,
       the rolling mode (-R) offers advantages.  Use of rolling mode requires
       that squatter be invoked in the DAEMON section.

       Sample entries for the DAEMON section of cyrus.conf(5) for rolling
       squatter operation:

              DAEMON {
                # run a rolling squatter using the default sync_log channel "squatter"
                squatter cmd="squatter -R"

                # run a rolling squatter using a specific sync_log channel
                squatter cmd="squatter -R -n indexer"
              }

       NOTE:
          When using the -R rolling mode, you MUST enable sync_log operation
          in imapd.conf(5) via the sync_log: on setting, and MUST define a
          sync_log channel via the sync_log_channels: setting.  If also using
          replication, you must either explicitly specify your replication
          sync_log channel via the sync_log_channels directive with a name, or
          specify the default empty name with "" (the two-character string
          U+22 U+22).  Please see imapd.conf(5) for details.
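
          A minimal imapd.conf(5) fragment matching the note above might
          look like the following.  It assumes the default channel name
          "squatter" and that channel names in sync_log_channels are
          separated by spaces; check imapd.conf(5) for your release before
          relying on it.

```text
# Required for "squatter -R" listening on the default "squatter" channel
sync_log: on
sync_log_channels: squatter

# If replication also uses the default (unnamed) channel, list it as "":
# sync_log_channels: "" squatter
```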

       NOTE:
          When configuring rolling search indexing on a replica, one must
          consider whether sync_logs will be written at all.  In this case,
          please consider the setting sync_log_unsuppressable_channels to
          ensure that the sync_log channel upon which one's squatter instance
          depends will continue to be written.  See imapd.conf(5) for details.

       NOTE:
          When using the Xapian search engine, you must define various
          settings in imapd.conf(5).  Please read all relevant Xapian
          documentation in this release before using Xapian.

       [NB: More examples needed]

HISTORY

       Support for additional search engines was added in version 3.0.

       The following command-line switches were added in version 3.0:

              -F -R -X -d -f -o -u

       The following command-line settings were added in version 3.0:

              -S <seconds>, -T <directory>, -f <synclogfile>, -n <channel>, -t srctier..., -z desttier

FILES

       /etc/imapd.conf, /etc/cyrus.conf

SEE ALSO

       imapd.conf(5), cyrus.conf(5)

AUTHOR

       The Cyrus Team, Nic Bernstein (Onlight)

COPYRIGHT

       1993–2023, The Cyrus Team

3.8.1                            Sep 11, 2023                      SQUATTER(8)