Unicode::CaseFold(3)  User Contributed Perl Documentation Unicode::CaseFold(3)

NAME

       Unicode::CaseFold - Unicode case-folding for case-insensitive lookups.

VERSION

       version 1.01

SYNOPSIS

           use Unicode::CaseFold;

           my $folded = fc $string;

   What is Case-Folding?
       In non-Unicode contexts, a common idiom to compare two strings case-
       insensitively is "lc($this) eq lc($that)". Before comparing two strings
       we normalize them to an all-lowercase version. "Hello", "HELLO", and
       "HeLlO" all have the same lowercase form ("hello"), so it doesn't
       matter which one we start with; they are all equal to one another after
       "lc".

       In Unicode, things aren't so simple. A Unicode character might have
       mappings for uppercase, lowercase, and titlecase, and the lowercase
       mapping of the uppercase mapping of a given character might not be the
       character that you started with! For example "lc(uc("\N{LATIN SMALL
       LETTER SHARP S}"))" is "ss", not the eszett we started off with. Case-
       folding is a part of the Unicode standard that allows any two strings
       that differ from one another only by case to map to the same "case-
       folded" form, even when those strings include characters with complex
       case-mappings.
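
       The round-trip problem described above can be sketched as follows.
       The variable names are illustrative only, and the call to
       utf8::upgrade is an assumption made here so that perl applies Unicode
       case rules to a character below U+0100:

```perl
use strict;
use warnings;
use Unicode::CaseFold;

# U+00DF, LATIN SMALL LETTER SHARP S. Upgrading the string makes perl apply
# Unicode case rules to it regardless of which features are enabled.
my $eszett = "\x{df}";
utf8::upgrade($eszett);

# uc() maps the sharp s to "SS", and lc() of that is "ss", so the round
# trip does not return the original character.
my $round_trip = lc(uc($eszett));

# With full case-folding, both spellings fold to the same string.
my $folds_match = fc($eszett) eq fc("SS") ? 1 : 0;

print "round trip changed the string\n" if $round_trip ne $eszett;
print "folded forms match\n"            if $folds_match;
```

       The second message relies on full case-folding being in effect.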

   Use for Case-insensitive Comparison
       Simply write "fc($this) eq fc($that)" instead of "lc($this) eq
       lc($that)".  You can also use "index" on case-folded strings for
       substring search.
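
       A minimal sketch of both uses. The helper names
       "equal_ignoring_case" and "find_ignoring_case" are hypothetical and
       are not part of the module:

```perl
use strict;
use warnings;
use Unicode::CaseFold;

# Hypothetical helper: true if the two strings are equal ignoring case.
sub equal_ignoring_case {
    my ($this, $that) = @_;
    return fc($this) eq fc($that) ? 1 : 0;
}

# Hypothetical helper: position of $needle in the folded $haystack, or -1.
sub find_ignoring_case {
    my ($haystack, $needle) = @_;
    return index(fc($haystack), fc($needle));
}

print "equal\n" if equal_ignoring_case("Hello", "HELLO");
print "found at ", find_ignoring_case("Say HELLO", "hello"), "\n";
```

       The offset returned by "index" refers to the folded string. Full
       folding can change a string's length (a sharp s becomes "ss"), so the
       offset does not always apply to the original string.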

   Use for String Lookups
       Frequently we want to store data in a hash, or a database, or an
       external file for later retrieval. Sometimes we want to be able to
       match the keys in this data case-insensitively -- that is, we should be
       able to store some data under the key "hello" and later retrieve it
       with the key "HELLO". Some databases have complete support for
       collation, but in other databases the support is missing or broken, and
       Perl hashes don't support it at all. By making case-folding part of the
       process you use to normalize your keys before using them to access a
       database or data structure, you get case-insensitive lookup.

           my %roles;
           $roles{fc "Samuel L. Jackson"} = ["Gin Rummy", "Nick Fury", "Mace Windu"];

           my $roles = $roles{fc "Samuel l. JACKSON"}; # Gets the data.

DESCRIPTION

       This module provides Unicode case-folding for Perl. Case-folding is a
       tool that allows a program to make case-insensitive string comparisons
       or do case-insensitive lookups.

EXPORTS

   fc($str)
       Exported by default when you use the module. Write "use
       Unicode::CaseFold ()" or "use Unicode::CaseFold qw(case_fold !fc)" if
       you don't want it to be exported.

       Returns the case-folded version of $str. This function is prototyped to
       act as much as possible like the built-ins "lc" and "uc"; it imposes a
       scalar context on its argument, and if called with no argument it will
       return the case-folded version of $_.
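
       A short sketch of the two calling styles, assuming the default
       export. The array names are illustrative:

```perl
use strict;
use warnings;
use Unicode::CaseFold;

my @words = ("Hello", "HELLO", "HeLlO");

# Called with no argument, fc folds $_, as lc and uc do.
my @folded;
for (@words) {
    push @folded, fc;
}

# Called with an explicit argument, in the same way as lc($_).
my @also_folded = map { fc($_) } @words;

print "@folded\n";
```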

   case_fold($str)
       Exported on request. Just like "fc", except that it has no prototype
       and won't case-fold $_ if called without an argument.
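
       A minimal sketch of requesting "case_fold" without "fc", using the
       import list given under "fc($str)". The variable names are
       illustrative:

```perl
use strict;
use warnings;
use Unicode::CaseFold qw(case_fold !fc);

my $folded = case_fold("HeLlO");

# case_fold has no prototype and no $_ default, so always pass the string.
my @folded = map { case_fold($_) } ("Hello", "HELLO");

print "$folded\n";
```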

VARIABLES

   $Unicode::CaseFold::XS
       True if the XS extension is in use. The pure-perl implementation is
       5-10 times slower than the XS extension, and on versions of perl before
       5.10.0 it will use simple case-folding instead of full case-folding
       (see below).

   $Unicode::CaseFold::SIMPLE_FOLDING
       Set to true if the perl version is prior to 5.10.0 and the XS
       extension is not available. In this case, "fc" will perform a simple
       case-folding instead of a full case-folding. Although relatively few
       characters are affected, strings case-folded using simple folding might
       not compare equal to the corresponding strings case-folded with full
       folding, which may cause compatibility issues.

       Furthermore, when simple folding is in use, some strings that would
       have case-folded to the same value when using full folding will instead
       case-fold to different values. For example, "fc("Wei\x{df}")" and
       "fc("Weiss")" both produce "weiss" when full folding is in effect, but
       the former produces "wei\x{df}" when using simple folding.

       If you want to check for this potentially dangerous situation, consult
       the $Unicode::CaseFold::SIMPLE_FOLDING variable.
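
       One way to perform that check at startup is sketched below. Warning
       rather than dying is an assumed policy, and the variable names
       $implementation and $folding are illustrative:

```perl
use strict;
use warnings;
use Unicode::CaseFold;

# Report which implementation and folding mode are active.
my $implementation = $Unicode::CaseFold::XS             ? "XS"     : "pure-perl";
my $folding        = $Unicode::CaseFold::SIMPLE_FOLDING ? "simple" : "full";

if ($Unicode::CaseFold::SIMPLE_FOLDING) {
    # Assumed policy: warn only. A stricter program might refuse to write
    # folded keys to shared storage while simple folding is in use.
    warn "Unicode::CaseFold is using simple folding; folded strings may "
       . "not match those produced with full folding\n";
}

print "implementation: $implementation, folding: $folding\n";
```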

COMPATIBILITY

       •   "Unicode::CaseFold" requires Perl 5.8.1 or newer.

       •   Different versions of perl include different versions of the
           Unicode database, which is revised over time. If you are likely to
           be comparing strings that have been folded using different versions
           of perl, you may need to consult the changes for intervening
           Unicode standard versions to find out whether your code will work
           correctly.

       •   "Unicode::CaseFold" uses "simple" rather than "full" case-folding
           when operating in pure-perl mode on perl versions before 5.10.0.
           For compatibility implications, see
           "$Unicode::CaseFold::SIMPLE_FOLDING".

SEE ALSO

       •   <http://unicode.org/reports/tr21/tr21-5.html>: Unicode Standard
           Annex #21: Case Mappings

AUTHOR

       Andrew Rodland <arodland@cpan.org>

COPYRIGHT AND LICENSE

       This software is copyright (c) 2017 by Andrew Rodland.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.



perl v5.36.0                      2022-07-22              Unicode::CaseFold(3)