RECOLLINDEX(1)              General Commands Manual             RECOLLINDEX(1)

NAME

       recollindex - indexing command for the Recoll full text search system

SYNOPSIS

       recollindex -h
       recollindex [ -c <cfdir>] [ -z|-Z ] [ -k ] [ --diagsfile <diagpath> ]
       recollindex [ -c <cfd>] -m [ -w <secs>] [ -D ] [ -x ] [ -C ] [ -n|-k ]
       recollindex [ -c <cfdir>] -i [ -Z -k -f -P ] [<path [path ...]>]
       recollindex [ -c <cfdir>] -r [ -Z -K -e -f ] [ -p pattern ] <dirpath>
       recollindex [ -c <cfdir>] -e [<path [path ...]>]
       recollindex [ -c <cfdir>] -l|-S|-E
       recollindex [ -c <cfdir>] -s <lang>
       recollindex [ -c <cfdir>] --webcache-compact
       recollindex [ -c <cfdir>] --webcache-burst <destdir>
       recollindex [ -c <cfdir>] --notindexed [path [path ...]]

DESCRIPTION

       The recollindex command is the Recoll indexer.

       As indexing can sometimes take a long time, the command can be
       interrupted by sending an interrupt (Ctrl-C, SIGINT) or terminate
       (SIGTERM) signal. Some time may elapse before the process exits,
       because it needs to properly flush and close the index. This can also
       be done from the recoll GUI (menu entry: File/Stop_Indexing). After
       such an interruption, the index will be somewhat inconsistent because
       some operations which are normally performed at the end of the
       indexing pass will have been skipped (for example, the stemming and
       spelling databases will be missing or out of date). You just need to
       restart indexing at a later time to restore consistency. The indexing
       will restart at the interruption point (the full file tree will be
       traversed, but files that were indexed up to the interruption and for
       which the index is still up to date will not need to be reindexed).

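       As an illustration of a clean interruption, here is a minimal shell
       sketch. The stop_indexer helper and its timeout argument are
       hypothetical and not part of Recoll; the sketch only relies on the
       signal behaviour described above.

```shell
# stop_indexer PID [TIMEOUT_SECONDS]   (hypothetical helper, not part of Recoll)
# Ask a running indexer to stop cleanly, then wait for it to exit.
stop_indexer() {
    pid=$1
    timeout=${2:-60}
    # SIGTERM (like SIGINT) lets recollindex flush and close the index.
    kill -TERM "$pid" 2>/dev/null || return 0   # no such process: nothing to do
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        # Still flushing after the timeout: report it, the caller decides.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

       A later plain recollindex run is then enough to restore consistency.
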
       The -c option specifies the configuration directory name, overriding
       the default or $RECOLL_CONFDIR.

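       The precedence just described can be sketched as a small shell
       function. The recoll_confdir name is hypothetical, and the
       $HOME/.recoll default is an assumption that this page does not
       state.

```shell
# recoll_confdir [CFDIR]   (hypothetical helper)
# Print the configuration directory that would be selected.
recoll_confdir() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"                 # explicit -c value wins
    elif [ -n "${RECOLL_CONFDIR:-}" ]; then
        printf '%s\n' "$RECOLL_CONFDIR"    # then the environment variable
    else
        printf '%s\n' "$HOME/.recoll"      # assumed default location
    fi
}
```
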
       There are several modes of operation.

       The normal mode will index the set of files described in the
       configuration file recoll.conf. This will incrementally update the
       database with files that changed since the last run. If option -z is
       given, the database will be erased before starting. If option -Z is
       given, the database will not be reset, but all files will be
       considered as needing reindexing (in place reset).

       As of version 1.21, recollindex does not usually retry files which
       previously failed to index (for example because of a missing helper
       program). If option -k is given, recollindex will try again to
       process all failed files. Please note that recollindex may also
       decide to retry failed files if the auxiliary checking script defined
       by the "checkneedretryindexscript" configuration variable indicates
       that this should happen.

       If option --diagsfile is given, the path given as parameter will be
       truncated and indexing diagnostics will be written to it. Each line
       in the file will have a diagnostic type (reason for the file not to
       be indexed), the file path, and a possible additional piece of
       information, which can be the MIME type or the archive internal path
       depending on the issue. The following diagnostic types are currently
       defined:

              Skipped : the path matches an element of skippedPaths or
              skippedNames.

              NoContentSuffix : the file name suffix is found in the
              noContentSuffixes list.

              MissingHelper : a helper program is missing.

              Error : general error (see the log).

              NoHandler : no handler is defined for the MIME type.

              ExcludedMime : the MIME type is part of the excludedmimetypes
              list.

              NotIncludedMime : the onlymimetypes list is not empty and the
              MIME type is not in it.

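       A diagnostics file can be summarized per type with standard tools.
       The sketch below assumes that the diagnostic type is the first
       whitespace-separated field of each line, possibly followed by a
       colon; the exact line layout is not specified here, so adjust the
       field handling to what your version writes. The diag_summary name
       is hypothetical.

```shell
# diag_summary FILE   (hypothetical helper)
# Print "count type" for each diagnostic type found in FILE.
# Assumption: the type is the first field of every non-empty line.
diag_summary() {
    awk 'NF { t = $1; sub(/:$/, "", t); n[t]++ }
         END { for (t in n) print n[t], t }' "$1" | sort -k2
}
```

       Typical use would be to run recollindex --diagsfile /tmp/diags.txt
       and then diag_summary /tmp/diags.txt (the file name is only an
       example).
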
       If option -m is given, recollindex is started for real time
       monitoring, using the file system monitoring package it was
       configured for (either fam, gamin, or inotify). This mode must have
       been explicitly configured when building the package; it is not
       available by default. The program will normally detach from the
       controlling terminal and become a daemon. If option -D is given, it
       will stay in the foreground. Option -w <seconds> can be used to
       specify that the program should sleep for the specified time before
       indexing begins. The default value is 60. The daemon normally
       monitors the X11 session and exits when it is reset. Option -x
       disables this X11 session monitoring (the daemon will stay alive even
       if it cannot connect to the X11 server). You also need to use this
       option if you run the daemon without an X11 context. You can use
       option -n to skip the initial incremental pass which is normally
       performed before monitoring starts. Once monitoring is started, the
       daemon normally monitors the configuration and restarts from scratch
       if a change is made. You can disable this with option -C.

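       As a sketch of how these options combine, the hypothetical
       start_monitor function below runs the monitor in the foreground
       without X11 session monitoring, which is one plausible setup when
       the process is started by a service supervisor rather than from a
       desktop session.

```shell
# start_monitor [DELAY_SECONDS]   (hypothetical wrapper)
start_monitor() {
    delay=${1:-60}      # 60 is the documented default for -w
    # -m: real time monitoring    -D: stay in the foreground
    # -x: do not monitor (or require) an X11 session
    recollindex -m -D -x -w "$delay"
}
```
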
       recollindex -i will index individual files into the database. The
       stem expansion and aspell databases will not be updated. The
       skippedPaths and skippedNames configuration variables will be used,
       so that some files may be skipped. You can tell recollindex to ignore
       skippedPaths and skippedNames by setting the -f option. This allows
       fully custom file selection for a given subtree, for which you would
       add the top directory to skippedPaths, and use any custom tool to
       generate the file list (e.g. a tool from a source code control
       system). When run this way, the indexer normally does not perform the
       deleted files purge pass, because it cannot be sure to have seen all
       the existing files. You can force a purge pass with -P.

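       The custom file selection described above could look like the
       following sketch, which uses git as the source code control tool.
       The index_tracked name is hypothetical, and the sketch assumes that
       REPO_DIR is an absolute path which you have added to skippedPaths.

```shell
# index_tracked REPO_DIR   (hypothetical helper)
# Index only the files tracked by git under REPO_DIR.
index_tracked() {
    repo=$1
    # git ls-files prints paths relative to the repository top, so each
    # one is prefixed with the top directory. Note that git may quote
    # unusual file names; such names are not handled by this sketch.
    git -C "$repo" ls-files |
        while IFS= read -r f; do
            printf '%s/%s\n' "$repo" "$f"
        done |
        recollindex -i -f      # -f: ignore skippedPaths and skippedNames
}
```
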
       recollindex -e will erase data for individual files from the
       database. The stem expansion databases will not be updated.

       Options -i and -e can be combined. This will first perform the purge,
       then the indexing.

       With options -i or -e, if no file names are given on the command
       line, they will be read from stdin, so that you could for example
       run:

       find /path/to/dir -print | recollindex -e -i

       to force the reindexing of a directory tree (which has to exist
       inside the file system area defined by topdirs in recoll.conf). You
       could mostly accomplish the same thing with

       find /path/to/dir -print | recollindex -Z -i

       The latter will perform a less thorough job of purging stale
       sub-documents though.

       recollindex -r mostly works like -i, but the parameter is a single
       directory, which will be recursively updated. This mostly does
       nothing more than find topdir | recollindex -i but it may be more
       convenient to use when started from another program. This retries
       failed files by default; use option -K to change this. One or
       multiple -p options can be used to set shell-type selection patterns
       (e.g.: *.pdf).

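       When calling -r from a script, the patterns must be quoted so that
       the shell does not expand them before recollindex sees them. A
       minimal sketch, with a hypothetical index_tree wrapper that turns
       each pattern argument into a -p option:

```shell
# index_tree DIR [PATTERN...]   (hypothetical wrapper)
index_tree() {
    dir=$1
    shift
    n=$#
    # Rotate through the patterns, appending "-p pattern" for each one.
    while [ "$n" -gt 0 ]; do
        set -- "$@" -p "$1"
        shift
        n=$((n - 1))
    done
    recollindex -r "$@" "$dir"
}
```

       For example, index_tree /path/to/dir '*.pdf' '*.odt' would run
       recollindex -r -p '*.pdf' -p '*.odt' /path/to/dir.
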
       recollindex -l will list the names of available language stemmers.

       recollindex -s will build the stem expansion database for a given
       language, which may or may not be part of the list in the
       configuration file. If the language is not part of the configuration,
       the stem expansion database will be deleted at the end of the next
       normal indexing run. You can get the list of stemmer names from the
       recollindex -l command. Note that this is mostly for experimental
       use; the normal way to add a stemming language is to set it in the
       configuration, either by editing "recoll.conf" or by using the GUI
       indexing configuration dialog.
       At the time of this writing, the following languages are recognized
       (out of Xapian's stem.h):

       •      danish

       •      dutch

       •      english (Martin Porter's 2002 revision of his stemmer)

       •      english_lovins (Lovins' stemmer)

       •      english_porter (Porter's stemmer as described in his 1980
              paper)

       •      finnish

       •      french

       •      german

       •      italian

       •      norwegian

       •      portuguese

       •      russian

       •      spanish

       •      swedish

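       A script can check a language name against the -l output before
       calling -s. The sketch below assumes that -l prints the stemmer
       names separated by whitespace, which is not specified here; the
       build_stemdb name is hypothetical.

```shell
# build_stemdb LANG   (hypothetical helper)
# Build the stem expansion database only if LANG is a known stemmer.
build_stemdb() {
    lang=$1
    # Assumption: "recollindex -l" prints whitespace-separated names.
    for name in $(recollindex -l); do
        if [ "$name" = "$lang" ]; then
            recollindex -s "$lang"
            return          # propagate the status of recollindex -s
        fi
    done
    echo "no stemmer named '$lang'" >&2
    return 1
}
```
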
       recollindex -S will rebuild the phonetic/orthographic index. This
       feature uses the aspell package, which must be installed on the
       system.

       recollindex -E will check that topdirs and the other relevant paths
       listed in the configuration file exist (to help catch typos).

       recollindex --webcache-compact will recover the space wasted by
       erased page instances inside the Web cache. It may temporarily need
       to use twice the disk space used by the Web cache.

       recollindex --webcache-burst <destdir> will extract all entries from
       the Web cache to files created inside <destdir>. Each cache entry is
       extracted as two files, for the data and metadata.

       recollindex --notindexed [path [path ...]] will check each path and
       print out those which are absent from the index (with an "ABSENT"
       prefix), or caused an indexing error (with an "ERROR" prefix). If no
       paths are given on the command line, the command will read them, one
       per line, from stdin.

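       The prefixed output lends itself to simple post-processing. The
       following sketch counts both categories; it assumes that each
       reported line starts with the literal word ABSENT or ERROR, and the
       report_unindexed name is hypothetical.

```shell
# report_unindexed [PATH...]   (hypothetical helper)
# Print "absent=N error=M" for the checked paths. With no arguments,
# recollindex reads the paths from this function's standard input.
report_unindexed() {
    recollindex --notindexed "$@" |
        awk '/^ABSENT/ { a++ }
             /^ERROR/  { e++ }
             END { printf "absent=%d error=%d\n", a, e }'
}
```

       For example: find /path/to/dir -type f -print | report_unindexed
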

SEE ALSO

       recoll(1), recoll.conf(5)

                                8 January 2006                  RECOLLINDEX(1)