Lucy::Index::DataWriter(3pm)  User Contributed Perl Documentation  Lucy::Index::DataWriter(3pm)

NAME
Lucy::Index::DataWriter - Write data to an index.

SYNOPSIS
    # Abstract base class.

DESCRIPTION
DataWriter is an abstract base class for writing index data, generally
in segment-sized chunks. Each component of an index – e.g. stored
fields, lexicon, postings, deletions – is represented by a
DataWriter/DataReader pair.

Components may be specified per index by subclassing
Lucy::Plan::Architecture.

CONSTRUCTORS
new
    my $writer = MyDataWriter->new(
        snapshot   => $snapshot,      # required
        segment    => $segment,       # required
        polyreader => $polyreader,    # required
    );

Abstract constructor.

• snapshot - The Snapshot that will be committed at the end of the
  indexing session.

• segment - The Segment in progress.

• polyreader - A PolyReader representing all existing data in the
  index. (If the index is brand new, the PolyReader will have no
  sub-readers.)

ABSTRACT METHODS
add_segment
    $data_writer->add_segment(
        reader  => $reader,     # required
        doc_map => $doc_map,    # default: undef
    );

Add content from an existing segment into the one currently being
written.

• reader - The SegReader containing content to add.

• doc_map - An array of integers mapping old document ids to new.
  Deleted documents are mapped to 0, indicating that they should be
  skipped.

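The doc_map convention described for add_segment can be illustrated with a small pure-Perl sketch. It does not use Lucy itself: the plain array @doc_map, the %old_docs hash, and the remap_docs helper are hypothetical stand-ins, and the sketch assumes document ids start at 1 so that slot 0 of the map is unused.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a doc_map: index = old doc id, value = new doc id.
# Slot 0 is unused, on the assumption that document ids start at 1.
my @doc_map = ( 0, 1, 0, 2, 3 );    # old doc 2 is mapped to 0 (deleted)

# Hypothetical per-document payloads from the segment being added.
my %old_docs = ( 1 => 'alpha', 2 => 'beta', 3 => 'gamma', 4 => 'delta' );

# Copy documents into the new id space, skipping any mapped to 0.
sub remap_docs {
    my ( $doc_map, $old_docs ) = @_;
    my %new_docs;
    for my $old_id ( sort { $a <=> $b } keys %$old_docs ) {
        my $new_id = $doc_map->[$old_id] // 0;
        next if $new_id == 0;    # deleted document: skip it
        $new_docs{$new_id} = $old_docs->{$old_id};
    }
    return \%new_docs;
}

my $new_docs = remap_docs( \@doc_map, \%old_docs );
```

Here the three surviving documents end up with contiguous new ids, and the deleted one is never copied.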
finish
    $data_writer->finish();

Complete the segment: close all streams, store metadata, etc.

format
    my $int = $data_writer->format();

Every writer must specify a file format revision number, which should
increment each time the format changes. Responsibility for revision
checking is left to the companion DataReader.

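Since revision checking is left to the reader side, a companion reader might do something along these lines. This is only a sketch under stated assumptions: $MAX_SUPPORTED_FORMAT and check_format are hypothetical names, not part of Lucy, and the metadata is modelled as a plain hashref carrying a "format" key.

```perl
use strict;
use warnings;

# Hypothetical: the highest format revision this imaginary reader can parse.
my $MAX_SUPPORTED_FORMAT = 2;

# Validate the revision recorded by the writer; die on anything unreadable.
sub check_format {
    my ($metadata) = @_;
    my $format = $metadata->{format};
    die "Missing 'format' in metadata\n" unless defined $format;
    die "Unsupported format revision: $format\n"
        if $format > $MAX_SUPPORTED_FORMAT;
    return $format;
}

# A revision the reader understands is accepted...
my $ok = check_format( { format => 1 } );

# ...while a newer one is rejected with an exception.
my $err = eval { check_format( { format => 3 } ); 1 } ? '' : $@;
```

Failing loudly on an unknown revision is one possible policy; a reader could equally choose to support several older revisions side by side.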
METHODS
delete_segment
    $data_writer->delete_segment($reader);

Remove a segment’s data. The default implementation is a no-op, as all
files within the segment directory will be automatically deleted.
Subclasses which manage their own files outside of the segment system
should override this method and use it as a trigger for cleaning up
obsolete data.

• reader - The SegReader whose segment data is to be removed, which
  must represent a segment which is part of the current snapshot.

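For a subclass that keeps files outside the segment system, the cleanup triggered by delete_segment might look roughly like the following. This sketch is pure Perl and does not call Lucy: the side directory, the ".cache" naming scheme, and the side_file_for and delete_segment_sketch helpers are all hypothetical, and a segment is identified by a plain name string rather than a SegReader.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical directory for side files kept outside the segment system.
my $side_dir = tempdir( CLEANUP => 1 );

# Hypothetical naming scheme: one side file per segment name.
sub side_file_for {
    my ($seg_name) = @_;
    return File::Spec->catfile( $side_dir, "$seg_name.cache" );
}

# What an overriding delete_segment might do: remove the obsolete side file.
# Returns 1 if a file was removed, 0 if there was nothing to clean up.
sub delete_segment_sketch {
    my ($seg_name) = @_;
    my $path = side_file_for($seg_name);
    return 0 unless -e $path;
    unlink $path or die "Can't unlink '$path': $!";
    return 1;
}

# Create a side file for "seg_1", then clean it up twice.
my $path = side_file_for('seg_1');
open my $fh, '>', $path or die "Can't open '$path': $!";
print {$fh} "obsolete data\n";
close $fh or die "Can't close '$path': $!";

my $removed = delete_segment_sketch('seg_1');    # file existed
my $again   = delete_segment_sketch('seg_1');    # already gone
```

Treating a missing file as a non-error keeps the cleanup safe to repeat.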
merge_segment
    $data_writer->merge_segment(
        reader  => $reader,     # required
        doc_map => $doc_map,    # default: undef
    );

Move content from an existing segment into the one currently being
written.

The default implementation calls add_segment() then delete_segment().

• reader - The SegReader containing content to merge, which must
  represent a segment which is part of the current snapshot.

• doc_map - An array of integers mapping old document ids to new.
  Deleted documents are mapped to 0, indicating that they should be
  skipped.

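The ordering described for the default merge_segment (add first, then delete) can be sketched with a mock class. DataWriterSketch is a hypothetical stand-in written for this illustration, not Lucy's implementation; it only records which calls were made, and the reader is represented by a plain string.

```perl
use strict;
use warnings;

package DataWriterSketch;

sub new {
    my ($class) = @_;
    return bless { log => [] }, $class;
}

# Record that content from the given (mock) reader was added.
sub add_segment {
    my ( $self, %args ) = @_;
    push @{ $self->{log} }, "add:$args{reader}";
    return;
}

# Record that the given (mock) reader's segment data was removed.
sub delete_segment {
    my ( $self, $reader ) = @_;
    push @{ $self->{log} }, "delete:$reader";
    return;
}

# Mirrors the documented default: add_segment(), then delete_segment().
sub merge_segment {
    my ( $self, %args ) = @_;
    $self->add_segment(%args);
    $self->delete_segment( $args{reader} );
    return;
}

package main;

my $writer = DataWriterSketch->new;
$writer->merge_segment( reader => 'seg_1', doc_map => undef );
my @log = @{ $writer->{log} };
```

A subclass that overrides only add_segment or delete_segment still gets a consistent merge_segment under this arrangement.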
metadata
    my $hashref = $data_writer->metadata();

Arbitrary metadata to be serialized and stored by the Segment. The
default implementation supplies a hash with a single key-value pair for
“format”.

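A subclass that wants to store more than the default "format" entry could extend the inherited hash. The sketch below uses mock classes only: BaseWriterSketch, CountingWriterSketch, and the doc_count key are hypothetical names invented for this illustration, and the base class merely imitates the documented default behaviour.

```perl
use strict;
use warnings;

package BaseWriterSketch;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# File format revision number for this mock writer.
sub format { 1 }

# Imitates the documented default: a hash with a single "format" entry.
sub metadata {
    my ($self) = @_;
    my %metadata = ( 'format' => $self->format );
    return \%metadata;
}

package CountingWriterSketch;
our @ISA = ('BaseWriterSketch');

# A later revision of this mock writer's (hypothetical) file format.
sub format { 3 }

# Start from the inherited hash and add a hypothetical extra entry.
sub metadata {
    my ($self) = @_;
    my $metadata = $self->SUPER::metadata();
    $metadata->{doc_count} = 42;
    return $metadata;
}

package main;

my $base_metadata = BaseWriterSketch->new->metadata;
my $metadata      = CountingWriterSketch->new->metadata;
```

Calling the parent method first keeps the "format" entry in step with whatever format() returns for the subclass.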
get_snapshot
    my $snapshot = $data_writer->get_snapshot();

Accessor for “snapshot” member var.

get_segment
    my $segment = $data_writer->get_segment();

Accessor for “segment” member var.

get_polyreader
    my $poly_reader = $data_writer->get_polyreader();

Accessor for “polyreader” member var.

get_schema
    my $schema = $data_writer->get_schema();

Accessor for “schema” member var.

get_folder
    my $folder = $data_writer->get_folder();

Accessor for “folder” member var.

INHERITANCE
Lucy::Index::DataWriter isa Clownfish::Obj.

perl v5.38.0                      2023-07-20      Lucy::Index::DataWriter(3pm)