KinoSearch1::Analysis::PolyAnalyzer(3)  User Contributed Perl Documentation  KinoSearch1::Analysis::PolyAnalyzer(3)

NAME

       KinoSearch1::Analysis::PolyAnalyzer - multiple analyzers in series

SYNOPSIS

           my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
               language  => 'es',
           );

           # or...
           my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
               analyzers => [
                   $lc_normalizer,
                   $custom_tokenizer,
                   $snowball_stemmer,
               ],
           );

DESCRIPTION

       A PolyAnalyzer is a series of Analyzers -- objects which inherit from
       KinoSearch1::Analysis::Analyzer -- each of which will be called upon to
       "analyze" text in turn.  You can either provide the Analyzers yourself,
       or you can specify a supported language, in which case a PolyAnalyzer
       consisting of an LCNormalizer, a Tokenizer, and a Stemmer will be
       generated for you.

       Supported languages:

           en => English,
           da => Danish,
           de => German,
           es => Spanish,
           fi => Finnish,
           fr => French,
           it => Italian,
           nl => Dutch,
           no => Norwegian,
           pt => Portuguese,
           ru => Russian,
           sv => Swedish,

CONSTRUCTOR

   new()
           my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
               language => 'en',
           );

       Construct a PolyAnalyzer object.  If the parameter "analyzers" is
       specified, it will override "language" and no attempt will be made to
       generate a default set of Analyzers.

       ·   language - Must be an ISO code from the list of supported
           languages.

       ·   analyzers - Must be an arrayref.  Each element in the array must
           inherit from KinoSearch1::Analysis::Analyzer.  The order of the
           analyzers matters.  Don't put a Stemmer before a Tokenizer (a
           Stemmer can't stem whole documents or paragraphs -- just
           individual words), or a Stopalizer after a Stemmer (stemmed
           words, e.g. "themselv", will not appear in a stoplist).  In
           general, the sequence should be: normalize, tokenize, stopalize,
           stem.
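
       The recommended sequence can be sketched as follows.  This is an
       illustration only, not the module's canonical usage: it assumes the
       sibling classes LCNormalizer, Tokenizer, Stopalizer, and Stemmer are
       installed, that Tokenizer works with its default settings, and that
       Stopalizer and Stemmer accept a "language" parameter.  Consult each
       class's own documentation before relying on these arguments.

```perl
use strict;
use warnings;

use KinoSearch1::Analysis::LCNormalizer;
use KinoSearch1::Analysis::Tokenizer;
use KinoSearch1::Analysis::Stopalizer;
use KinoSearch1::Analysis::Stemmer;
use KinoSearch1::Analysis::PolyAnalyzer;

# Constructor arguments below are assumptions; see each class's man page.
my $lc_normalizer = KinoSearch1::Analysis::LCNormalizer->new;
my $tokenizer     = KinoSearch1::Analysis::Tokenizer->new;
my $stopalizer    = KinoSearch1::Analysis::Stopalizer->new( language => 'en' );
my $stemmer       = KinoSearch1::Analysis::Stemmer->new( language => 'en' );

# Order matters: normalize, tokenize, stopalize, stem.
my $analyzer = KinoSearch1::Analysis::PolyAnalyzer->new(
    analyzers => [ $lc_normalizer, $tokenizer, $stopalizer, $stemmer ],
);
```

       Because "analyzers" overrides "language", no "language" parameter is
       passed to the PolyAnalyzer itself in this sketch.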

COPYRIGHT

       Copyright 2005-2010 Marvin Humphrey

LICENSE, DISCLAIMER, BUGS, etc.

       See KinoSearch1 version 1.00.

perl v5.12.2                      2010-10-05                      KinoSearch1::Analysis::PolyAnalyzer(3)