Unicode::Casing(3) User Contributed Perl Documentation Unicode::Casing(3)

Unicode::Casing - Perl extension to override system case changing
functions

    use Unicode::Casing
        uc => \&my_uc, lc => \&my_lc,
        ucfirst => \&my_ucfirst, lcfirst => \&my_lcfirst,
        fc => \&my_fc;
    no Unicode::Casing;

    package foo::bar;
    use Unicode::Casing -load;
    sub import {
        Unicode::Casing->import(
            uc      => \&_uc,
            lc      => \&_lc,
            ucfirst => \&_ucfirst,
            lcfirst => \&_lcfirst,
            fc      => \&_fc,
        );
    }
    sub unimport {
        Unicode::Casing->unimport;
    }

This module allows overriding the system-defined character case
changing operations. Any time something in its lexical scope would
ordinarily call lc(), lcfirst(), uc(), ucfirst(), or fc(), the
corresponding user-specified function will instead be called. This
applies to direct calls (even those prefaced by "CORE::"), and indirect
calls via the "\L", "\l", "\U", "\u", and "\F" escapes in double-quoted
strings and regular expressions.

Each function is passed a string whose case is to be changed, and
should return the case-changed version of that string. Within the
function's dynamic scope, references to the operation it is overriding
use the non-overridden version. For example:

    sub my_uc {
        my $string = shift;
        print "Debugging information\n";
        return uc($string);
    }
    use Unicode::Casing uc => \&my_uc;
    uc($foo);

gives the standard upper-casing behavior, but prints "Debugging
information" first. This also applies to the escapes: using, for
example, "\U" inside the override function for uc() calls the
non-overridden uc(). Since this applies across the dynamic scope, if
"my_uc" calls function "a" which calls "b" which calls "c" which calls
"uc", that "uc" is the non-overridden version. Otherwise infinite
recursion would be possible. This also fits the typical use of these
functions, which is to apply the standard case change except for a few
select characters, as shown in the example below.

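The pattern just described, standard casing except for a few select
characters, might look like the following minimal sketch. The function
name german_uc and the choice of mapping U+00DF (sharp s) to U+1E9E
(capital sharp S) are illustrative assumptions, not part of this module.
The function is called directly here so that the sketch runs without the
module installed; the commented-out "use" line shows how it would be
registered.

```perl
use strict;
use warnings;

# Hypothetical override: upper-case sharp s (U+00DF) to capital sharp s
# (U+1E9E) rather than to "SS"; everything else gets standard casing.
sub german_uc {
    my $string = shift;
    $string =~ s/\x{DF}/\x{1E9E}/g;
    # When installed as an override, this uc() is the non-overridden
    # version, so there is no recursion.
    return uc($string);
}

# To install it for the enclosing lexical scope:
#   use Unicode::Casing uc => \&german_uc;

my $result = german_uc("stra\x{DF}e");

# Print the code points rather than the (wide) characters themselves.
print join(' ', map { sprintf 'U+%04X', ord } split //, $result), "\n";
# U+0053 U+0054 U+0052 U+0041 U+1E9E U+0045
```

Only the one special character is handled by hand; the rest of the work
is delegated to the standard uc().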
It is an error not to specify at least one override in the "use"
statement. Any operation without an override keeps its standard
behavior. It is also an error to specify more than one override for the
same function.

"use re 'eval'" is not needed for the inline case-changing sequences to
work in regular expressions.

Here's an example of a real-life application, for Turkish, that shows
context-sensitive case-changing. (Because of bugs in earlier Perls,
Perl v5.12 is required for this example to work properly.)

    sub turkish_lc($) {
        my $string = shift;

        # Unless an I is before a dot_above, it turns into a dotless i (the
        # dot above being attached to the I, without an intervening other
        # Above mark; an intervening non-mark (ccc=0) would mean that the
        # dot above would be attached to that character and not the I)
        $string =~ s/I (?! [^\p{ccc=0}\p{ccc=Above}]* \x{0307} )/\x{131}/gx;

        # But when the I is followed by a dot_above, remove the dot_above so
        # the end result will be i.
        $string =~ s/I ([^\p{ccc=0}\p{ccc=Above}]* ) \x{0307}/i$1/gx;

        # Capital I with dot above (U+0130) lower-cases to a plain i.
        $string =~ s/\x{130}/i/g;

        return lc($string);
    }

A potential problem with context-dependent case changing is that the
routine may be passed insufficient context, especially with the in-line
escapes like "\L".
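
To illustrate, the sketch below calls the turkish_lc function from the
example above directly (repeated here without its prototype so that the
block is self-contained). When the dot above is passed along with the I,
the result is a plain i. When only the I is passed, as would be expected
for something like "\LI\E\x{0307}" where the mark lies outside the span
handed to the override, the I becomes a dotless i and the mark is left
behind. The helper name codepoints is hypothetical.

```perl
use strict;
use warnings;

sub turkish_lc {
    my $string = shift;
    $string =~ s/I (?! [^\p{ccc=0}\p{ccc=Above}]* \x{0307} )/\x{131}/gx;
    $string =~ s/I ([^\p{ccc=0}\p{ccc=Above}]* ) \x{0307}/i$1/gx;
    $string =~ s/\x{130}/i/g;
    return lc($string);
}

# Hypothetical helper: render a string as a list of code points.
sub codepoints {
    return join ' ', map { sprintf 'U+%04X', ord } split //, shift;
}

# Full context: the function sees both the I and the dot above.
my $with_context = turkish_lc("I\x{0307}");

# Insufficient context: only the I is case-changed; the dot above is
# appended afterwards.
my $without_context = turkish_lc("I") . "\x{0307}";

print codepoints($with_context), "\n";       # U+0069
print codepoints($without_context), "\n";    # U+0131 U+0307
```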

90turkish.t, which comes with the distribution, includes a full
implementation of all the Turkish casing rules.

Note that there are problems with the standard case changing operation
for characters whose code points are between 128 and 255. To get the
correct Unicode behavior, the strings must be encoded in utf8 (which
the override functions can force) or calls to the operations must be
within the scope of "use feature 'unicode_strings'" (which is available
starting in Perl version 5.12).
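
The two remedies just named can be sketched as follows. The function
names upgraded_uc and scoped_uc are hypothetical; utf8::upgrade() is a
Perl built-in that converts a string's internal representation to utf8
without changing its characters. The example uses U+00E9 (e with acute),
whose upper case is U+00C9.

```perl
use strict;
use warnings;

# Remedy 1: the override forces the string into Perl's internal utf8
# representation before calling the standard operation.
sub upgraded_uc {
    my $string = shift;
    utf8::upgrade($string);
    return uc($string);
}

# Remedy 2: call the standard operation within the scope of
# "use feature 'unicode_strings'" (Perl 5.12 or later).
sub scoped_uc {
    use feature 'unicode_strings';
    my $string = shift;
    return uc($string);
}

printf "U+%04X\n", ord(upgraded_uc("\x{E9}"));    # U+00C9
printf "U+%04X\n", ord(scoped_uc("\x{E9}"));      # U+00C9
```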

Also, note that fc() and "\F" are available only in Perls starting with
version v5.15.8. Trying to override them on earlier versions will
result in a fatal error.
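
One way to avoid that fatal error is to request the fc override only
when the running Perl supports it. The sketch below is an assumption
about how a caller might do this, not something the module itself
provides; the names my_lc, my_fc, and overrides_for_this_perl are
hypothetical, and the override bodies are placeholders. The resulting
list could be passed to Unicode::Casing->import(), as in the wrapper
package shown at the top of this page.

```perl
use strict;
use warnings;

# Placeholder overrides; real ones would adjust select characters.
sub my_lc { my $string = shift; return lc($string) }
sub my_fc { my $string = shift; return lc($string) }    # stand-in body

# Build the override list, including fc only on Perls that have it
# (v5.15.8 and later; $] holds the numeric Perl version).
sub overrides_for_this_perl {
    my %overrides = (lc => \&my_lc);
    $overrides{fc} = \&my_fc if $] >= 5.015008;
    return %overrides;
}

my %overrides = overrides_for_this_perl();
print join(', ', sort keys %overrides), "\n";
```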

Note that there can be problems installing this module (at least on
Windows) when using an old version of ExtUtils::Depends. To get around
this, follow these steps:

1. Upgrade ExtUtils::Depends

2. Force install B::Hooks::OP::Check

3. Force install B::Hooks::OP::PPAddr

See <http://perlmonks.org/?node_id=797851>.

This module doesn't play well when there are other attempts to override
the functions, such as "use subs qw(uc lc ...);" or
"*CORE::GLOBAL::uc = sub { .... };". Which override gets called depends
on the ordering of the calls, and scoping rules break down.

Karl Williamson, "<khw@cpan.org>", with advice and guidance from
various Perl 5 porters, including Paul Evans, Burak Gürsoy, Florian
Ragwitz, Ricardo Signes, and Matt S. Trout.

Copyright (C) 2011 by Karl Williamson

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself, either Perl version 5.10.1 or, at
your option, any later version of Perl 5 you may have available.

perl v5.38.0 2023-07-21 Unicode::Casing(3)