Lucy::Index::Indexer(3)  User Contributed Perl Documentation  Lucy::Index::Indexer(3)

NAME

       Lucy::Index::Indexer - Build inverted indexes.

SYNOPSIS

           my $indexer = Lucy::Index::Indexer->new(
               schema => $schema,
               index  => '/path/to/index',
               create => 1,
           );
           while ( my ( $title, $content ) = each %source_docs ) {
               $indexer->add_doc({
                   title   => $title,
                   content => $content,
               });
           }
           $indexer->commit;

DESCRIPTION

       The Indexer class is Apache Lucy's primary tool for managing the
       content of inverted indexes, which may later be searched using
       IndexSearcher.

       In general, only one Indexer at a time may write to an index safely.
       If a write lock cannot be secured, new() will throw an exception.

       If an index is located on a shared volume, each writer application must
       identify itself by supplying an IndexManager with a unique "host" id to
       Indexer's constructor, or index corruption will occur.  See FileLocking
       for a detailed discussion.

       Note: at present, delete_by_term() and delete_by_query() only affect
       documents which were previously committed to the index, not documents
       added during the current indexing session but not yet committed.  This
       may change in a future update.

CONSTRUCTORS

   new
           my $indexer = Lucy::Index::Indexer->new(
               schema   => $schema,             # required at index creation
               index    => '/path/to/index',    # required
               create   => 1,                   # default: 0
               truncate => 1,                   # default: 0
               manager  => $manager             # default: created internally
           );

       ·   schema - A Schema.  Required when index is being created; if not
           supplied, will be extracted from the index folder.

       ·   index - Either a filepath to an index or a Folder.

       ·   create - If true and the index directory does not exist, attempt to
           create it.

       ·   truncate - If true, proceed with the intention of discarding all
           previous indexing data.  The old data will remain intact and
           visible until commit() succeeds.

       ·   manager - An IndexManager.

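       The sketch below shows one way to pass a "manager" to new() for an
       index on a shared volume.  It assumes that the machine's hostname is
       unique among all writers; the one-field schema and the temporary
       directory are hypothetical stand-ins for a real schema and a real
       shared path.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Sys::Hostname qw( hostname );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::IndexManager;
use Lucy::Index::Indexer;

# Hypothetical one-field schema, present only to make the sketch complete.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'title', type => Lucy::Plan::StringType->new );

# Stand-in for an index path on a shared volume.
my $index_path = tempdir( CLEANUP => 1 ) . '/index';

# Assumption: hostname() yields an id that no other writer uses.
my $manager = Lucy::Index::IndexManager->new( host => hostname() );

my $indexer = Lucy::Index::Indexer->new(
    schema  => $schema,
    index   => $index_path,
    create  => 1,           # the index directory does not exist yet
    manager => $manager,
);
$indexer->add_doc( { title => 'hello' } );
$indexer->commit;
```

       If several writer processes run on the same machine, the hostname alone
       would not be unique, and some other scheme for choosing the id would
       be needed.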

METHODS

   add_doc
           $indexer->add_doc($doc);
           $indexer->add_doc( { field_name => $field_value } );
           $indexer->add_doc(
               doc   => { field_name => $field_value },
               boost => 2.5,         # default: 1.0
           );

       Add a document to the index.  Accepts either a single argument or
       labeled params.

       ·   doc - Either a Lucy::Document::Doc object, or a hashref (which will
           be attached to a Lucy::Document::Doc object internally).

       ·   boost - A floating point weight which affects how this document
           scores.

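       As an illustration, the three calling conventions above might be used
       together as follows.  The "title" field, the schema, and the temporary
       index location are assumptions made for the sake of a complete
       example.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Document::Doc;
use Lucy::Index::Indexer;
use Lucy::Index::IndexReader;

# Hypothetical schema with a single string field.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'title', type => Lucy::Plan::StringType->new );

my $index_path = tempdir( CLEANUP => 1 ) . '/index';
my $indexer    = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $index_path,
    create => 1,
);

# Single argument: a plain hashref.
$indexer->add_doc( { title => 'first' } );

# Single argument: an explicit Doc object.
my $doc = Lucy::Document::Doc->new( fields => { title => 'second' } );
$indexer->add_doc($doc);

# Labeled params, which allow a per-document boost.
$indexer->add_doc(
    doc   => { title => 'third' },
    boost => 2.5,
);

$indexer->commit;

# Count the documents that are now visible to readers.
my $reader    = Lucy::Index::IndexReader->open( index => $index_path );
my $doc_count = $reader->doc_count;
```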
   add_index
           $indexer->add_index($index);

       Absorb an existing index into this one.  The two indexes must have
       matching Schemas.

       ·   index - Either an index path name or a Folder.

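       A minimal sketch of absorbing one index into another is given below.
       Both indexes are built from the same hypothetical Schema object so
       that the schemas match; the helper build_index() and the paths are
       illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Index::IndexReader;

# One hypothetical schema shared by both indexes, so that they match.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'title', type => Lucy::Plan::StringType->new );

my $dir    = tempdir( CLEANUP => 1 );
my $path_a = "$dir/index_a";
my $path_b = "$dir/index_b";

# Illustrative helper: create an index holding a single document.
sub build_index {
    my ( $path, $title ) = @_;
    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => $path,
        create => 1,
    );
    $indexer->add_doc( { title => $title } );
    $indexer->commit;
}
build_index( $path_a, 'from index a' );
build_index( $path_b, 'from index b' );

# Open index A for writing and absorb index B by path name.
my $indexer = Lucy::Index::Indexer->new( index => $path_a );
$indexer->add_index($path_b);
$indexer->commit;

my $doc_count
    = Lucy::Index::IndexReader->open( index => $path_a )->doc_count;
```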
   delete_by_term
           $indexer->delete_by_term(
               field => $field,  # required
               term  => $term,   # required
           );

       Mark documents which contain the supplied term as deleted, so that they
       will be excluded from search results and eventually removed altogether.
       The change is not apparent to search apps until after commit()
       succeeds.

       ·   field - The name of an indexed field. (If the field is not
           specified as "indexed", an error will occur.)

       ·   term - The term which identifies docs to be marked as deleted.  If
           "field" is associated with an Analyzer, "term" will be processed
           automatically (so don't pre-process it yourself).

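       One common use of delete_by_term() is replacing a document: delete
       the committed version by a unique key, then add the new version in the
       same session.  The sketch below assumes a hypothetical "id" field
       that holds such a key; the schema and paths are likewise illustrative.
       Because deletions only affect previously committed documents, the
       newly added replacement is not deleted.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::TermQuery;

# Hypothetical schema: "id" is a unique key, "title" is the payload.
my $schema = Lucy::Plan::Schema->new;
my $type   = Lucy::Plan::StringType->new;
$schema->spec_field( name => 'id',    type => $type );
$schema->spec_field( name => 'title', type => $type );

my $index_path = tempdir( CLEANUP => 1 ) . '/index';

# Session 1: commit the original document.
{
    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => $index_path,
        create => 1,
    );
    $indexer->add_doc( { id => 'doc-1', title => 'old title' } );
    $indexer->commit;
}

# Session 2: mark the committed version deleted, then add its replacement.
{
    my $indexer = Lucy::Index::Indexer->new( index => $index_path );
    $indexer->delete_by_term( field => 'id', term => 'doc-1' );
    $indexer->add_doc( { id => 'doc-1', title => 'new title' } );
    $indexer->commit;
}

# Look the document up by its key.
my $searcher = Lucy::Search::IndexSearcher->new( index => $index_path );
my $hits     = $searcher->hits(
    query => Lucy::Search::TermQuery->new( field => 'id', term => 'doc-1' ),
);
my $total = $hits->total_hits;
my $hit   = $hits->next;
my $title = $hit->{title};
```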
   delete_by_query
           $indexer->delete_by_query($query);

       Mark documents which match the supplied Query as deleted.

       ·   query - A Query.

   delete_by_doc_id
           $indexer->delete_by_doc_id($doc_id);

       Mark the document identified by the supplied document ID as deleted.

       ·   doc_id - A document id.

   optimize
           $indexer->optimize();

       Optimize the index for search-time performance.  This may take a while,
       as it can involve rewriting large amounts of data.

       Every Indexer session which changes index content and ends in a
       commit() creates a new segment.  Once written, segments are never
       modified.  However, they are periodically recycled by feeding their
       content into the segment currently being written.

       The optimize() method causes all existing index content to be fed back
       into the Indexer.  When commit() completes after an optimize(), the
       index will consist of one segment.  For this reason, optimize() must be
       called before commit().  Also, optimizing a fresh index created from
       scratch has no effect.

       Historically, there was a significant search-time performance benefit
       to collapsing down to a single segment versus even two segments.  Now
       the effect of collapsing is much less significant, and calling
       optimize() is rarely justified.

   commit
           $indexer->commit();

       Commit any changes made to the index.  Until this is called, none of
       the changes made during an indexing session are permanent.

       Calling commit() invalidates the Indexer, so if you want to make more
       changes you'll need a new one.

   prepare_commit
           $indexer->prepare_commit();

       Perform the expensive setup for commit() in advance, so that commit()
       completes quickly.  (If prepare_commit() is not called explicitly by
       the user, commit() will call it internally.)

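       A sketch of the two-step sequence follows.  The schema, the field name,
       and the temporary index location are assumptions; the point is only
       the ordering of prepare_commit() and commit().

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Index::IndexReader;

# Hypothetical one-field schema.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'title', type => Lucy::Plan::StringType->new );

my $index_path = tempdir( CLEANUP => 1 ) . '/index';
my $indexer    = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $index_path,
    create => 1,
);
$indexer->add_doc( { title => 'hello' } );

# Do the expensive work up front ...
$indexer->prepare_commit;

# ... the application may do other work at this point ...

# ... then make the changes permanent with a quick final step.
$indexer->commit;

my $doc_count
    = Lucy::Index::IndexReader->open( index => $index_path )->doc_count;
```

       Splitting the work this way can be useful when the moment at which
       changes become visible needs to be kept short, since only the final
       commit() remains to be done at that moment.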
   get_schema
           my $schema = $indexer->get_schema();

       Accessor for schema.

INHERITANCE

       Lucy::Index::Indexer isa Clownfish::Obj.

perl v5.32.0                      2020-07-28           Lucy::Index::Indexer(3)