DOODLE(1)                 General Commands Manual                DOODLE(1)

NAME
doodle - a tool to search the meta-data in your files

SYNOPSIS
doodle [OPTIONS] ([FILENAMES]*|[KEYWORDS]*)

DESCRIPTION
doodle is a tool to index files. doodle uses libextractor to find
meta-data in files. Once a database has been built, doodle can be used
to quickly find files whose meta-data matches a given search string.
This way, doodle can be used to quickly search your file system.

Generally, the first time you run doodle you pass the option -b to
build the database. Together with -b you specify the list of files or
directories to index, for example:

       $ doodle -b $HOME

Indexing with doodle is incremental. If doodle -b is run twice with
the same database, the second run updates the index for files that
have changed. doodle will also remove files that are no longer
accessible. doodle will NOT remove files that are still present but no
longer specified in the argument list. Thus invoking either

       $ doodle -b /foo /bar # or

       $ doodle -b /foo ; doodle -b /bar

will result in the same database containing the index for both /foo
and /bar. Note that the only way to un-index just /foo at this point is
to make /foo inaccessible (for example with chmod 000 /foo or even
rm -rf /foo) and then run doodle -b again.
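
The un-indexing procedure described above can be sketched as a small
shell helper. This is only a sketch under stated assumptions:
unindex_dir is a hypothetical name, GNU stat (-c %a) is assumed for
reading the current permissions, and the helper relies on doodle -b
dropping entries it can no longer access, as described above. Since
chmod 000 does not block access for root, it is meant to be run as an
ordinary user.

```shell
#!/bin/sh
# unindex_dir DIR [DOODLE OPTIONS...]
# Hypothetical helper: make DIR temporarily inaccessible, re-run the
# build so that doodle drops DIR's entries, then restore the permissions.
# Assumes GNU stat; has no useful effect when run as root.
unindex_dir() {
    dir=$1
    shift
    mode=$(stat -c %a "$dir") || return 1   # remember the current mode
    chmod 000 "$dir" || return 1
    doodle -b "$@" "$dir"                   # extra options, e.g. -d FILE
    status=$?
    chmod "$mode" "$dir"                    # restore access in any case
    return "$status"
}
```

A call such as unindex_dir /foo would then remove /foo from the default
database while leaving the directory itself intact. Running doodle -b
on /foo later would index it again.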

In networked environments, it often makes sense to build a database at
the root of each file system, containing the entries for that file
system. For this, doodle is run for each file system on the file
server where that file system is on a local disk, to prevent thrashing
the network. Users can select which databases doodle searches.
Databases cannot be concatenated together.
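
Because database files cannot be concatenated, several per-file-system
databases are selected by listing them, separated by colons. The
following is a minimal sketch of building such a list; join_db_paths
is a hypothetical helper name, not part of doodle.

```shell
#!/bin/sh
# join_db_paths FILE...
# Hypothetical helper: join database file names with ':' so the result
# can be given to -d or exported as DOODLE_PATH when searching.
join_db_paths() {
    joined=
    for db in "$@"; do
        if [ -z "$joined" ]; then
            joined=$db
        else
            joined=$joined:$db
        fi
    done
    printf '%s\n' "$joined"
}
```

With two databases at the assumed locations /home/.doodle-db and
/srv/.doodle-db, a search across both could then be written as
doodle -d "$(join_db_paths /home/.doodle-db /srv/.doodle-db)" keyword.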

Once the files have been indexed, you can quickly query the doodle
database. Just run

       $ doodle keyword

to search all of your files for keyword. Note that only the meta-data
extracted by libextractor is searched. Thus if libextractor does not
find any meta-data in the files, you may not get any results. You can
use the option -l to specify non-standard libextractor plugins. For
example, doodle could be used to replace the locate tool from the GNU
findutils like this:

       $ alias updatedb="doodle -bn -d ~/.doodle-locate-db -l libextractor_filename /"

       $ alias locate="doodle -d ~/.doodle-locate-db"

OPTIONS
-a NUMBER, --approximate=NUMBER
       do approximate matching with mismatches of up to NUMBER letters

-b, --build
       build the doodle database (passed arguments are directories and
       filenames that are to be indexed). In comparison with GNU
       locate, the doodle binary encapsulates both the locate and the
       updatedb tool. With the -b option, doodle builds or updates the
       database (equivalent to updatedb); without -b it behaves
       similarly to locate.

-B LANG, --binary=LANG
       Use the generic plaintext extractor for the language with the
       2-letter language code LANG. Supported languages are DA
       (Danish), DE (German), EN (English), ES (Spanish), IT (Italian)
       and NO (Norwegian). Use this option to enable fulltext indexing
       (for a particular language). This option only makes sense
       together with the -b option.

-d FILENAME, --database=FILENAME
       use FILENAME for the location of the database (use when building
       or searching). This option is particularly useful when doodle
       is used to search different types of files (or is operated with
       different extractor options). Using this option, doodle can be
       used to build specialized indices (e.g. one per file system),
       which can in turn improve search performance. When searching,
       you can pass a colon-separated list of database file names; in
       that case all databases are searched. Note that the disk-space
       consumption of a single database is typically slightly smaller
       than if the database is split into multiple files.
       Nevertheless, the space savings are likely to be small (a few
       percent). You can also use the environment variable DOODLE_PATH
       to set the list of database files to search. The option
       overrides the environment variable if both are used. If the
       option is not given and DOODLE_PATH is not set, "~/.doodle" is
       used.

-e, --extract
       print the extracted keywords for each matching file found. Note
       that this will slow down the program a lot, especially if there
       are many matches in the database. Note that if the options
       given for libextractor are different from the options used for
       building the index, the results may not contain the search
       string.

-f, --filenames
       include filenames (full path) in the set of keywords

-h, --help
       print help page

-H ALGORITHM, --hash=ALGORITHM
       Use the ALGORITHM to compute a hash of each file (possible
       algorithms are sha1 and md5).

-i, --ignore-case
       be case-insensitive

-l LIBRARIES, --library=LIBRARIES
       specify which libextractor plugins to use (for building the
       index with -b or for printing information about files with -e)

-L FILENAME, --log=FILENAME
       log all encountered keywords into a log file named FILENAME.
       This option is mostly useful for debugging.

-m LIMIT, --memory=LIMIT
       use at most LIMIT MB of memory for the nodes of the suffix-tree
       (after that, serialize to disk). Note that a smaller value will
       reduce memory consumption but increase the size of the temporary
       file (and slow down indexing). The default is 8 MB.

-n, --nodefault
       do not load the default set of plugins (only load plugins
       specified with -l)

-p, --print
       make a human-readable screen dump of the doodle database (only
       really useful for debugging)

-P PATH, --prunepaths=PATH
       Directories to exclude from the database that would otherwise
       be indexed. The environment variable PRUNEPATHS also sets this
       value. Default is "/tmp /usr/tmp /var/tmp /dev /proc /sys".
       This option can also be used when searching, in which case
       search results in the specified directories will be ignored.

-v, --version
       print the version number

-V, --verbose
       be verbose
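
The order in which the database location is chosen (the -d option,
then DOODLE_PATH, then "~/.doodle") can be illustrated with a short
shell sketch. resolve_db_list is a hypothetical helper that only
mirrors the rule documented for -d; it does not call doodle, and
treating an empty DOODLE_PATH like an unset one is an assumption.

```shell
#!/bin/sh
# resolve_db_list [OPTION_VALUE]
# Hypothetical helper: print the database list that the documented
# precedence selects. OPTION_VALUE stands for the argument of -d and
# may be empty or omitted. An empty DOODLE_PATH is treated as unset
# (an assumption, the manual only covers the unset case).
resolve_db_list() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"              # -d overrides the environment
    elif [ -n "${DOODLE_PATH:-}" ]; then
        printf '%s\n' "$DOODLE_PATH"    # colon-separated list
    else
        printf '%s\n' "$HOME/.doodle"   # documented default
    fi
}
```

This can help when debugging a wrapper script that sets DOODLE_PATH
globally but passes -d for individual searches.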

ENVIRONMENT
DOODLE_PATH
       Colon-separated list of databases to search. Note that when
       building the database this path must either contain only one
       filename or the option -d must be used to specify the database
       file. Default is "~/.doodle".

PRUNEPATHS
       Space-separated list of paths to exclude. Can be overridden
       with the -P option.
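
To make the space-separated format of PRUNEPATHS concrete, the sketch
below checks whether a path falls under one of the listed directories.
is_pruned is a hypothetical helper; matching by directory prefix is an
assumption made for illustration, and doodle's exact matching rule may
differ. The -P option, which overrides the variable, is not modelled.

```shell
#!/bin/sh
# is_pruned PATH
# Illustration only: succeed if PATH equals, or lies below, one of the
# directories in PRUNEPATHS (falling back to the documented default).
# The function body runs in a subshell so that "set -f" stays local.
is_pruned() (
    set -f    # no globbing while the list is split on whitespace
    for dir in ${PRUNEPATHS:-/tmp /usr/tmp /var/tmp /dev /proc /sys}; do
        case $1 in
            "$dir" | "$dir"/*) return 0 ;;
        esac
    done
    return 1
)
```

For example, with PRUNEPATHS unset, is_pruned /tmp/scratch succeeds,
while is_pruned /tmpfiles fails because /tmpfiles is not below /tmp.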

NOTES
Doodle depends on libextractor. You can download libextractor from
http://gnunet.org/libextractor/.

SEE ALSO
extract(1), slocate(1), updatedb(1), libextractor(3), libdoodle(3)

LEGAL NOTICE
libdoodle and doodle are released under the GPL.

REPORTING BUGS
Report bugs to mantis <http://gnunet.org/mantis/> or by electronic
mail to <christian@grothoff.org>.

AUTHORS
doodle was originally written by Christian Grothoff
<christian@grothoff.org>.

AVAILABILITY
You can obtain the original author's latest version from
http://gnunet.org/doodle/.

doodle                          Jun 29 2005                     DOODLE(1)