Lucy::Analysis::Normalizer(3pm)  User Contributed Perl Documentation  Lucy::Analysis::Normalizer(3pm)

NAME

       Lucy::Analysis::Normalizer - Unicode normalization, case folding and
       accent stripping.

SYNOPSIS

           my $normalizer = Lucy::Analysis::Normalizer->new;

           my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new(
               analyzers => [ $tokenizer, $normalizer, $stemmer ],
           );

DESCRIPTION

       Normalizer is an Analyzer which normalizes tokens to one of the Unicode
       normalization forms. Optionally, it performs Unicode case folding and
       converts accented characters to their base character.

       If you use highlighting, Normalizer should be run after tokenization
       because it might add or remove characters, and token offsets must
       still refer to positions in the original text.
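       The three operations named above can be illustrated with core Perl
       alone. This is only a sketch of the concepts, not Lucy's
       implementation: it assumes Perl 5.16 or later for fc() and uses the
       core Unicode::Normalize module, and the variable names are
       illustrative.

```perl
use strict;
use warnings;
use feature 'fc';                      # fc() needs Perl 5.16 or later
use Unicode::Normalize qw(NFD NFKC);

# "Caf" + e-acute, a space, then the "fi" ligature (U+FB01) + "n".
my $text = "Caf\x{e9} \x{fb01}n";

# 1. Normalization: NFKC replaces the ligature with plain "fi".
my $normalized = NFKC($text);

# 2. Case folding: a caseless form suitable for comparison.
my $folded = fc($normalized);

# 3. Accent stripping (approximation): decompose, then drop
#    nonspacing combining marks.
( my $stripped = NFD($folded) ) =~ s/\p{Mn}//g;

binmode STDOUT, ':encoding(UTF-8)';
print "$normalized\n$folded\n$stripped\n";
```

       The last step shows why the string can change length: the
       decomposed form is longer than the input, and the stripped form
       can be shorter.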

CONSTRUCTORS

   new
           my $normalizer = Lucy::Analysis::Normalizer->new(
               normalization_form => 'NFKC',
               case_fold          => 1,
               strip_accents      => 0,
           );

       Create a new Normalizer.

       •   normalization_form - Unicode normalization form, can be one of
           'NFC', 'NFKC', 'NFD', 'NFKD'. Defaults to 'NFKC'.

       •   case_fold - Perform case folding, default is true.

       •   strip_accents - Strip accents, default is false.
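       As a sketch of how these options might be combined in a full
       analysis chain, the following builds a PolyAnalyzer around a
       Normalizer. The helper name build_analyzer is hypothetical, and
       the choice of StandardTokenizer and an English SnowballStemmer is
       an assumption made for illustration. Lucy is loaded inside an
       eval so the example does nothing when the distribution is not
       installed.

```perl
use strict;
use warnings;

# Lucy is a CPAN distribution, not a core module, so load it defensively.
my $have_lucy = eval { require Lucy; 1 };

# Hypothetical helper: tokenizer -> normalizer -> stemmer.
# Any options are passed straight through to Normalizer->new.
sub build_analyzer {
    my (%normalizer_opts) = @_;
    return Lucy::Analysis::PolyAnalyzer->new(
        analyzers => [
            Lucy::Analysis::StandardTokenizer->new,
            Lucy::Analysis::Normalizer->new(%normalizer_opts),
            Lucy::Analysis::SnowballStemmer->new( language => 'en' ),
        ],
    );
}

if ($have_lucy) {
    # Documented defaults: NFKC, case folding on, accents kept.
    my $default = build_analyzer();

    # Accent-insensitive matching with canonical composition only.
    my $accent_insensitive = build_analyzer(
        normalization_form => 'NFC',
        case_fold          => 1,
        strip_accents      => 1,
    );
}
```

       The Normalizer sits after the tokenizer, in line with the advice
       in DESCRIPTION about highlighting.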

METHODS

   transform
           my $inversion = $normalizer->transform($inversion);

       Takes a single Inversion as input and returns an Inversion, either the
       same one (presumably transformed in some way), or a new one.

       •   inversion - An inversion.
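       A minimal sketch of calling transform directly, assuming the
       Token and Inversion constructors and the next and get_text
       accessors behave as described in the Lucy::Analysis::Token and
       Lucy::Analysis::Inversion documentation. The helper name
       normalize_text is hypothetical. In normal use an Inversion is
       produced by a tokenizer earlier in the chain rather than built by
       hand.

```perl
use strict;
use warnings;

# Lucy is a CPAN distribution, not a core module, so load it defensively.
my $have_lucy = eval { require Lucy; 1 };

# Hypothetical helper: wrap one string in a single-token Inversion,
# run it through the Normalizer, and return the resulting token texts.
sub normalize_text {
    my ( $normalizer, $text ) = @_;
    my $token = Lucy::Analysis::Token->new(
        text         => $text,
        start_offset => 0,
        end_offset   => length($text),
    );
    my $inversion = Lucy::Analysis::Inversion->new( seed => $token );

    # transform may return the same Inversion or a new one, so always
    # continue with the return value.
    my $result = $normalizer->transform($inversion);

    my @texts;
    while ( my $t = $result->next ) {
        push @texts, $t->get_text;
    }
    return @texts;
}

if ($have_lucy) {
    binmode STDOUT, ':encoding(UTF-8)';
    my $normalizer = Lucy::Analysis::Normalizer->new( strip_accents => 1 );
    print join( ' ', normalize_text( $normalizer, "Caf\x{e9}" ) ), "\n";
}
```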

INHERITANCE

       Lucy::Analysis::Normalizer isa Lucy::Analysis::Analyzer isa
       Clownfish::Obj.

perl v5.38.0                      2023-07-20   Lucy::Analysis::Normalizer(3pm)