Unicode::Casing(3)    User Contributed Perl Documentation   Unicode::Casing(3)

NAME

       Unicode::Casing - Perl extension to override system case changing
       functions

SYNOPSIS

         use Unicode::Casing
                   uc => \&my_uc, lc => \&my_lc,
                   ucfirst => \&my_ucfirst, lcfirst => \&my_lcfirst,
                   fc => \&my_fc;
         no Unicode::Casing;

         package foo::bar;
           use Unicode::Casing -load;
           sub import {
               Unicode::Casing->import(
                   uc      => \&_uc,
                   lc      => \&_lc,
                   ucfirst => \&_ucfirst,
                   lcfirst => \&_lcfirst,
                   fc      => \&_fc,
               );
           }
           sub unimport {
               Unicode::Casing->unimport;
           }

DESCRIPTION

       This module allows overriding the system-defined character case
       changing operations.  Any time something in its lexical scope would
       ordinarily call "lc()", "lcfirst()", "uc()", "ucfirst()", or "fc()",
       the corresponding user-specified function will instead be called.  This
       applies to direct calls (even those prefaced by "CORE::"), and indirect
       calls via the "\L", "\l", "\U", "\u", and "\F" escapes in double-quoted
       strings and regular expressions.

       Each function is passed a string whose case is to be changed, and
       should return the case-changed version of that string.  Within the
       function's dynamic scope, references to the operation it is overriding
       use the non-overridden version.  For example:

        sub my_uc {
           my $string = shift;
           print "Debugging information\n";
           return uc($string);    # the non-overridden uc()
        }
        use Unicode::Casing uc => \&my_uc;
        uc($foo);

       gives the standard upper-casing behavior, but prints "Debugging
       information" first.  This also applies to the escapes.  Using, for
       example, "\U" inside the override function for "uc()" will call the
       non-overridden "uc()".  Since this applies across the dynamic scope, if
       "my_uc" calls function "a" which calls "b" which calls "c" which calls
       "uc", that "uc" is the non-overridden version.  Otherwise there would
       be the possibility of infinite recursion.  It also fits the typical
       use of these functions, which is to use the standard case change
       except for a few select characters, as shown in the example below.

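       The "few select characters" pattern described above can be sketched as
       follows.  This is only an illustration, not part of the module: the
       name "german_uc" and the choice to map U+00DF (sharp s) to U+1E9E
       (capital sharp s) are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical override: special-case one character, then defer to the
# standard operation.  Inside this function "uc" is the non-overridden
# version, so there is no recursion.
sub german_uc {
    my $string = shift;
    $string =~ s/\x{DF}/\x{1E9E}/g;    # sharp s -> capital sharp s
    return uc($string);
}

use Unicode::Casing uc => \&german_uc;

my $word   = "stra\x{DF}e";
my $direct = uc($word);                # direct call goes through german_uc
my $escape = "\U$word\E";              # so does the "\U" escape
```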
       It is an error to not specify at least one override in the "use"
       statement.  Operations not specified keep their standard behavior.  It
       is also an error to specify more than one override for the same
       function.

       "use re 'eval'" is not needed to have the inline case-changing
       sequences work in regular expressions.

       Here's an example of a real-life application, for Turkish, that shows
       context-sensitive case-changing.  (Because of bugs in earlier Perls,
       version v5.12 is required for this example to work properly.)

        sub turkish_lc($) {
           my $string = shift;

           # Unless an I is before a dot_above, it turns into a dotless i (the
           # dot above being attached to the I, without an intervening other
           # Above mark; an intervening non-mark (ccc=0) would mean that the
           # dot above would be attached to that character and not the I)
           $string =~ s/I (?! [^\p{ccc=0}\p{ccc=Above}]* \x{0307} )/\x{131}/gx;

           # But when the I is followed by a dot_above, remove the dot_above so
           # the end result will be i.
           $string =~ s/I ([^\p{ccc=0}\p{ccc=Above}]* ) \x{0307}/i$1/gx;

           # LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130) becomes a plain i.
           $string =~ s/\x{130}/i/g;

           return lc($string);
        }

       A potential problem with context-dependent case changing is that the
       routine may be passed insufficient context, especially with the in-line
       escapes like "\L".

       90turkish.t, which comes with the distribution, includes a full
       implementation of all the Turkish casing rules.

       Note that there are problems with the standard case changing operation
       for characters whose code points are between 128 and 255.  To get the
       correct Unicode behavior, the strings must be encoded in utf8 (which
       the override functions can force) or calls to the operations must be
       within the scope of "use feature 'unicode_strings'" (which is available
       starting in Perl version 5.12).

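       As a rough sketch of the two remedies just described, the following
       uses only core Perl.  "my_uc" stands for a hypothetical override that
       would be registered with "use Unicode::Casing uc => \&my_uc"; the
       sample character U+00E0 is an arbitrary choice for illustration.

```perl
use strict;
use warnings;

# Hypothetical override body.  utf8::upgrade() switches the string to Perl's
# internal UTF-8 representation, so uc() applies Unicode rules even to
# code points 128..255.
sub my_uc {
    my $string = shift;
    utf8::upgrade($string);
    return uc($string);
}

my $a_grave = "\xE0";    # U+00E0, LATIN SMALL LETTER A WITH GRAVE

# Remedy 1: force the encoding inside the override.
my $upgraded = my_uc($a_grave);

# Remedy 2: make the call within the scope of the feature (Perl 5.12+).
my $featured = do {
    use feature 'unicode_strings';
    uc($a_grave);
};

# Both results should be U+00C0, LATIN CAPITAL LETTER A WITH GRAVE.
printf "U+%04X U+%04X\n", ord($upgraded), ord($featured);
```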
       Also, note that "fc()" and "\F" are available only in Perls starting
       with version v5.15.8.  Trying to override them on earlier versions will
       result in a fatal error.

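       One way to avoid that fatal error in code that must also run on older
       Perls is to register the "fc" override only when the running Perl
       supports it.  The sketch below follows the wrapper-package pattern from
       the SYNOPSIS; the package name "My::Casing" and the placeholder subs
       "_uc" and "_fc" are hypothetical.

```perl
package My::Casing;    # hypothetical wrapper package

use strict;
use warnings;
use Unicode::Casing -load;

# Placeholder overrides; real ones would adjust selected characters.
sub _uc { my $string = shift; return uc($string) }
sub _fc { my $string = shift; return lc($string) }    # stand-in body only

sub import {
    my %overrides = (uc => \&_uc);

    # $] is 5.015008 for Perl v5.15.8, the first version with fc() and "\F".
    $overrides{fc} = \&_fc if $] >= 5.015008;

    Unicode::Casing->import(%overrides);
}

sub unimport {
    Unicode::Casing->unimport;
}

1;
```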
       Note that there can be problems installing this (at least on Windows)
       if using an old version of ExtUtils::Depends.  To get around this,
       follow these steps:

       1.  Upgrade ExtUtils::Depends

       2.  Force install B::Hooks::OP::Check

       3.  Force install B::Hooks::OP::PPAddr

       See <http://perlmonks.org/?node_id=797851>.

BUGS

       This module doesn't play well when there are other attempts to override
       the functions, such as "use subs qw(uc lc ...);" or
       "*CORE::GLOBAL::uc = sub { .... };".  Which override gets called depends
       on the ordering of the calls, and scoping rules break down.

AUTHOR

       Karl Williamson, "<khw@cpan.org>", with advice and guidance from
       various Perl 5 porters, including Paul Evans, Burak Gürsoy, Florian
       Ragwitz, Ricardo Signes, and Matt S. Trout.

       Copyright (C) 2011 by Karl Williamson

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself, either Perl version 5.10.1 or, at
       your option, any later version of Perl 5 you may have available.

perl v5.30.1                      2020-01-30                Unicode::Casing(3)