Lucy(3) User Contributed Perl Documentation Lucy(3)

NAME
Lucy - Apache Lucy search engine library.

VERSION
0.6.2

SYNOPSIS
First, plan out your index structure, create the index, and add
documents:

    # indexer.pl

    use Lucy::Index::Indexer;
    use Lucy::Plan::Schema;
    use Lucy::Analysis::EasyAnalyzer;
    use Lucy::Plan::FullTextType;

    # Create a Schema which defines index fields.
    my $schema = Lucy::Plan::Schema->new;
    my $easyanalyzer = Lucy::Analysis::EasyAnalyzer->new(
        language => 'en',
    );
    my $type = Lucy::Plan::FullTextType->new(
        analyzer => $easyanalyzer,
    );
    $schema->spec_field( name => 'title',   type => $type );
    $schema->spec_field( name => 'content', type => $type );

    # Create the index and add documents.
    # %source_docs (title => content) is assumed to be defined elsewhere.
    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => '/path/to/index',
        create => 1,
    );
    while ( my ( $title, $content ) = each %source_docs ) {
        $indexer->add_doc({
            title   => $title,
            content => $content,
        });
    }
    $indexer->commit;

Then, search the index:

    # search.pl

    use Lucy::Search::IndexSearcher;

    my $searcher = Lucy::Search::IndexSearcher->new(
        index => '/path/to/index'
    );
    my $hits = $searcher->hits( query => "foo bar" );
    while ( my $hit = $hits->next ) {
        print "$hit->{title}\n";
    }

DESCRIPTION
The Apache Lucy search engine library delivers high-performance,
modular full-text search.

Features
· Extremely fast. A single machine can handle millions of documents.

· Scalable to multiple machines.

· Incremental indexing (addition/deletion of documents to/from an
  existing index).

· Configurable near-real-time index updates.

· Unicode support.

· Support for boolean operators AND, OR, and AND NOT; parenthetical
  groupings; prepended +plus and -minus.

· Algorithmic selection of relevant excerpts and highlighting of
  search terms within excerpts.

· Highly customizable query and indexing APIs.

· Customizable sorting.

· Phrase matching.

· Stemming.

· Stoplists.
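
As a rough illustration of two of the features above (incremental
indexing and boolean query syntax), the following sketch builds a
throwaway index and then updates it in a second Indexer session. The
temporary directory, the 'id' field, the helper subs update_index and
count_hits, and the sample text are all assumptions made for this
example; they are not part of Lucy.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Analysis::EasyAnalyzer;
use Lucy::Index::Indexer;
use Lucy::Plan::FullTextType;
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Search::IndexSearcher;

# Assumption: a scratch directory stands in for a real index location.
my $path = tempdir( CLEANUP => 1 );

# Schema with an exact-match 'id' field (hypothetical) and analyzed text.
my $schema    = Lucy::Plan::Schema->new;
my $text_type = Lucy::Plan::FullTextType->new(
    analyzer => Lucy::Analysis::EasyAnalyzer->new( language => 'en' ),
);
$schema->spec_field( name => 'id',      type => Lucy::Plan::StringType->new );
$schema->spec_field( name => 'content', type => $text_type );

# Run one indexing session: open the index, apply changes, commit.
sub update_index {
    my ( $create, $changes ) = @_;
    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => $path,
        create => $create,
    );
    $changes->($indexer);
    $indexer->commit;
}

# Count matches for a query string using a freshly opened searcher.
sub count_hits {
    my ($query_string) = @_;
    my $searcher = Lucy::Search::IndexSearcher->new( index => $path );
    return $searcher->hits( query => $query_string )->total_hits;
}

# Initial build: create the index and add two documents.
update_index( 1, sub {
    my ($indexer) = @_;
    $indexer->add_doc({ id => '1', content => 'apples and oranges' });
    $indexer->add_doc({ id => '2', content => 'apples and pears' });
});
my $before = count_hits('apples AND NOT pears');

# Incremental update: delete one document and add another,
# without rebuilding the existing index.
update_index( 0, sub {
    my ($indexer) = @_;
    $indexer->delete_by_term( field => 'id', term => '2' );
    $indexer->add_doc({ id => '3', content => 'pears and plums' });
});
my $apples = count_hits('apples');
my $pears  = count_hits('pears');

print "before: $before, apples: $apples, pears: $pears\n";
```

An IndexSearcher is a point-in-time view of the index, which is why
count_hits opens a fresh searcher after each commit.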

Getting Started
Lucy::Simple provides a stripped-down API which may suffice for many
tasks.
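
A minimal sketch of the Lucy::Simple workflow, assuming a temporary
directory as the index location and two made-up documents (both are
placeholders for this example only):

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Simple;

# Assumption: scratch directory and sample documents for illustration.
my $path        = tempdir( CLEANUP => 1 );
my %source_docs = (
    'Moby-Dick' => 'A captain hunts a white whale.',
    'Walden'    => 'A writer lives beside a pond.',
);

# One object handles both indexing and searching.
my $lucy = Lucy::Simple->new(
    path     => $path,
    language => 'en',
);
while ( my ( $title, $content ) = each %source_docs ) {
    $lucy->add_doc({ title => $title, content => $content });
}

# search() returns the total number of matching documents;
# next() then iterates over the hits that were fetched.
my $total_hits = $lucy->search(
    query      => 'whale',
    offset     => 0,
    num_wanted => 10,
);
print "Total hits: $total_hits\n";
while ( my $hit = $lucy->next ) {
    print "$hit->{title}\n";
}
```

Lucy::Simple fixes the schema to 'title' and 'content' style full-text
fields chosen by the documents you add, so it trades flexibility for
brevity; the classes listed below offer finer control.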

Lucy::Docs::Tutorial demonstrates how to build a basic CGI search
application.

The tutorial spends most of its time on these five classes:

· Lucy::Plan::Schema - Plan out your index.

· Lucy::Plan::FieldType - Define index fields.

· Lucy::Index::Indexer - Manipulate index content.

· Lucy::Search::IndexSearcher - Search an index.

· Lucy::Analysis::EasyAnalyzer - A one-size-fits-all
  parser/tokenizer.

Delving Deeper
Lucy::Docs::Cookbook augments the tutorial with more advanced recipes.

For creating complex queries, see Lucy::Search::Query and its
subclasses TermQuery, PhraseQuery, ANDQuery, ORQuery, NOTQuery,
RequiredOptionalQuery, MatchAllQuery, and NoMatchQuery, plus
Lucy::Search::QueryParser.
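
The sketch below shows one way these pieces might be combined: a
QueryParser turns user input into a Query, and an ANDQuery restricts it
with a TermQuery. The scratch index, the 'category' field, and the
sample documents are assumptions for illustration. TermQuery matches
its term as given, so the sketch applies it to an unanalyzed StringType
field.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Analysis::EasyAnalyzer;
use Lucy::Index::Indexer;
use Lucy::Plan::FullTextType;
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Search::ANDQuery;
use Lucy::Search::IndexSearcher;
use Lucy::Search::QueryParser;
use Lucy::Search::TermQuery;

# Assumption: a scratch index with a hypothetical 'category' field.
my $path      = tempdir( CLEANUP => 1 );
my $schema    = Lucy::Plan::Schema->new;
my $text_type = Lucy::Plan::FullTextType->new(
    analyzer => Lucy::Analysis::EasyAnalyzer->new( language => 'en' ),
);
$schema->spec_field( name => 'title',    type => $text_type );
$schema->spec_field( name => 'content',  type => $text_type );
$schema->spec_field( name => 'category', type => Lucy::Plan::StringType->new );

my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);
$indexer->add_doc({
    title => 'Orchard notes', category => 'fruit',
    content => 'ripe red apples',
});
$indexer->add_doc({
    title => 'Pear guide', category => 'fruit',
    content => 'green pears',
});
$indexer->add_doc({
    title => 'Kitchen gadgets', category => 'tool',
    content => 'a corer for apples',
});
$indexer->commit;

my $searcher = Lucy::Search::IndexSearcher->new( index => $path );

# Parse free-text input against the 'content' field only.
my $qparser = Lucy::Search::QueryParser->new(
    schema => $searcher->get_schema,
    fields => ['content'],
);
my $parsed = $qparser->parse('apples');

# Require an exact category match in addition to the parsed query.
my $in_category = Lucy::Search::TermQuery->new(
    field => 'category',
    term  => 'fruit',
);
my $combined = Lucy::Search::ANDQuery->new(
    children => [ $parsed, $in_category ],
);

my $hits = $searcher->hits( query => $combined );
my @titles;
while ( my $hit = $hits->next ) {
    push @titles, $hit->{title};
}
print "$_\n" for @titles;
```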

For distributed searching, see LucyX::Remote::SearchServer,
LucyX::Remote::SearchClient, and LucyX::Remote::ClusterSearcher.

Backwards Compatibility Policy
Lucy will spin off stable forks into new namespaces periodically. The
first will be named "Lucy1". Users who require strong backwards
compatibility should use a stable fork.

The main namespace, "Lucy", is an API-unstable development branch (as
hinted at by its 0.x.x version number). Superficial interface changes
happen frequently. Hard file format compatibility breaks which require
reindexing are rare, as we generally try to provide continuity across
multiple releases, but we reserve the right to make such changes.

CLASS METHODS
The Lucy module itself does not have a large interface, providing only
a single public class method.

error
    my $instream = $folder->open_in( file => 'foo' ) or die Clownfish->error;

Access a shared variable which is set by some routines on failure. It
will always be either a Clownfish::Err object or undef.

SUPPORT
The Apache Lucy homepage, where you'll find links to our mailing lists
and so on, is <http://lucy.apache.org>. Please direct support
questions to the Lucy users mailing list.

BUGS
Not thread-safe.

Some exceptions leak memory.

If you find a bug, please inquire on the Lucy users mailing list about
it, then report it on the Lucy issue tracker once it has been
confirmed: <https://issues.apache.org/jira/browse/LUCY>.

COPYRIGHT
Apache Lucy is distributed under the Apache License, Version 2.0, as
described in the file "LICENSE" included with the distribution.

perl v5.32.0 2020-07-28 Lucy(3)