utf8::all(3)          User Contributed Perl Documentation         utf8::all(3)

NAME

       utf8::all - turn on Unicode - all of it

VERSION

       version 0.024

SYNOPSIS

           use utf8::all;                      # Turn on UTF-8, all of it.

           open my $in, '<', 'contains-utf8';  # UTF-8 already turned on here
           print length 'føø bār';             # 7 UTF-8 characters
           my $utf8_arg = shift @ARGV;         # @ARGV is UTF-8 too (only for main)

DESCRIPTION

       The "use utf8" pragma tells the Perl parser to allow UTF-8 in the
       program text in the current lexical scope. This also means that you can
       now use literal Unicode characters as part of strings, variable names,
       and regular expressions.

       "utf8::all" goes further:

       •   "charnames" are imported so "\N{...}" sequences can be used to
           compile Unicode characters based on names.

       •   On Perl "v5.11.0" or higher, "use feature 'unicode_strings'" is
           enabled.

       •   "use feature 'fc'" and "use feature 'unicode_eval'" are enabled on
           Perl 5.16.0 and higher.

       •   Filehandles are opened with UTF-8 encoding turned on by default
           (including "STDIN", "STDOUT", and "STDERR" when "utf8::all" is used
           from the "main" package). This means that they automatically
           convert UTF-8 octets to characters and vice versa. If you don't
           want UTF-8 for a particular filehandle, you'll have to call
           "binmode $filehandle".

       •   @ARGV gets converted from UTF-8 octets to Unicode characters (when
           "utf8::all" is used from the "main" package). This is similar to
           the behaviour of the "-CA" perl command-line switch (see perlrun).

       •   "readdir", "readlink", "readpipe" (including the "qx//" and
           backtick operators), and "glob" (including the "<>" operator) now
           all work with and return Unicode characters instead of (UTF-8)
           octets (again only when "utf8::all" is used from the "main"
           package).
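
       As a rough illustration of the purely lexical features in the list
       above, the following sketch enables them with core pragmas only. It is
       an approximation for orientation, not the module's actual
       implementation; the variable names are hypothetical, the file is
       assumed to be saved as UTF-8, and Perl 5.16 or higher is assumed for
       "fc".

```perl
use strict;
use warnings;
use utf8;                             # source text of this file is UTF-8
use charnames ':full';                # makes \N{NAME} sequences available
use feature qw(unicode_strings fc);   # fc requires Perl 5.16 or higher

# A named character and the literal character are the same code point.
my $named   = "\N{LATIN SMALL LETTER O WITH STROKE}";
my $literal = 'ø';
my $same    = $named eq $literal ? 1 : 0;

# fc() performs Unicode case folding; 'ß' folds to 'ss'.
my $folded_equal = fc('STRASSE') eq fc('straße') ? 1 : 0;

# With "use utf8", length counts characters rather than octets.
my $chars = length 'føø bār';
```

       Here $same and $folded_equal end up true and $chars is 7, matching the
       character count shown in the SYNOPSIS.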

   Lexical Scope
       The pragma is lexically scoped, so you can do the following if you
       have some reason to:

           {
               use utf8::all;
               open my $out, '>', 'outfile';
               my $utf8_str = 'føø bār';
               print length $utf8_str, "\n"; # 7
               print $out $utf8_str;         # out as utf8
           }
           open my $in, '<', 'outfile';      # in as raw
           my $text = do { local $/; <$in>};
           print length $text, "\n";         # 10, not 7!
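
       The 7-versus-10 difference in the example above can be reproduced
       without the pragma by naming the I/O layers explicitly. This sketch
       uses only core Perl and an in-memory buffer instead of a file; the
       ":encoding(UTF-8)" layer is an assumption standing in for whatever
       layer "utf8::all" applies by default inside its scope.

```perl
use strict;
use warnings;
use utf8;   # the string literal below is written in UTF-8

my $str    = 'føø bār';
my $buffer = '';

# Write through an encoding layer: characters become UTF-8 octets.
open my $out, '>:encoding(UTF-8)', \$buffer or die "open for write: $!";
print {$out} $str;
close $out or die "close: $!";

# Read back raw: one element per octet.
open my $raw, '<:raw', \$buffer or die "open raw: $!";
my $as_octets = do { local $/; <$raw> };
close $raw;

# Read back through the encoding layer: octets become characters again.
open my $in, '<:encoding(UTF-8)', \$buffer or die "open decoded: $!";
my $as_chars = do { local $/; <$in> };
close $in;

my $octet_count = length $as_octets;   # counts octets
my $char_count  = length $as_chars;    # counts characters
```

       $octet_count is 10 and $char_count is 7, and $as_chars compares equal
       to the original string.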

       Instead of lexical scoping, you can also use "no utf8::all" to turn off
       the effects.

       Note that the effect on @ARGV and the "STDIN", "STDOUT", and "STDERR"
       file handles is always global and cannot be undone!

   Enabling/Disabling Global Features
       As described above, the default behaviour of "utf8::all" is to convert
       @ARGV, to open the "STDIN", "STDOUT", and "STDERR" file handles with
       UTF-8 encoding, and to override the "readlink" and "readdir" functions
       and "glob" operators when "utf8::all" is used from the "main" package.

       If you want to disable these features even when "utf8::all" is used
       from the "main" package, add the option "NO-GLOBAL" (or "LEXICAL-ONLY")
       to the use line. E.g.:

           use utf8::all 'NO-GLOBAL';

       If, on the other hand, you want to enable these global effects even
       when "utf8::all" is used from a package other than "main", use the
       option "GLOBAL" on the use line:

           use utf8::all 'GLOBAL';
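
       For orientation, the @ARGV conversion that these options switch on or
       off amounts to decoding each argument from UTF-8 octets into
       characters. The sketch below shows that idea with the core Encode
       module; "decode_args" is a hypothetical helper written for this
       example, not a function provided by "utf8::all", and the check flags
       are assumed from the defaults described in this document.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a list of UTF-8 octet strings into characters.
sub decode_args {
    my @octets = @_;
    return map {
        decode('UTF-8', $_, Encode::FB_CROAK() | Encode::LEAVE_SRC())
    } @octets;
}

# 'føø' as a shell would typically pass it: five UTF-8 octets.
my @raw     = ("f\xC3\xB8\xC3\xB8");
my @decoded = decode_args(@raw);

my $before = length $raw[0];       # octets
my $after  = length $decoded[0];   # characters
```

       $before is 5 and $after is 3. With 'NO-GLOBAL', a program that still
       wants character arguments would have to perform a step like this
       itself.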

   UTF-8 Errors
       "utf8::all" treats invalid code points (i.e., UTF-8 that does not map
       to a valid Unicode "character") as a fatal error.

       For "glob", "readdir", and "readlink", one can change this behaviour by
       setting the attribute "$utf8::all::UTF8_CHECK".

ATTRIBUTES

   $utf8::all::UTF8_CHECK
       By default "utf8::all" marks decoding errors as fatal (default value
       for this setting is "Encode::FB_CROAK"). If you want, you can change
       this by setting $utf8::all::UTF8_CHECK. The value "Encode::FB_WARN"
       reports the decoding errors as warnings, and "Encode::FB_DEFAULT" will
       completely ignore them. Please see Encode for details. Note:
       "Encode::LEAVE_SRC" is always enforced.

       Important: This setting only controls the handling of decoding errors
       in "glob", "readdir", and "readlink".
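
       The check values named above are ordinary Encode constants, so their
       effect can be tried out directly with "Encode::decode". The sketch
       below shows Encode's own behaviour on a malformed octet string; that
       "utf8::all" passes $utf8::all::UTF8_CHECK to Encode in the same way
       is an assumption, and the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bad = "abc\xFFdef";   # the octet 0xFF can never occur in valid UTF-8

# FB_CROAK: decoding the malformed input dies.
my $died = eval {
    decode('UTF-8', $bad, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
} ? 0 : 1;

# FB_DEFAULT: no error is raised; the malformed octet is replaced with
# U+FFFD REPLACEMENT CHARACTER.
my $lenient = decode('UTF-8', $bad, Encode::FB_DEFAULT() | Encode::LEAVE_SRC());

# LEAVE_SRC keeps the input string unmodified.
my $source_intact = $bad eq "abc\xFFdef" ? 1 : 0;
```

       $died is true, $lenient contains U+FFFD in place of the bad octet, and
       $bad is left untouched.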

INTERACTION WITH AUTODIE

       If you use autodie, which is a great idea, you need to use at least
       version 2.12, released on June 26, 2012
       <https://metacpan.org/source/PJF/autodie-2.12/Changes#L3>.  Otherwise,
       autodie obliterates the IO layers set by the open pragma. See RT #54777
       <https://rt.cpan.org/Ticket/Display.html?id=54777> and GH #7
       <https://github.com/doherty/utf8-all/issues/7>.
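
       One way to enforce that minimum is to state the version on the "use"
       line, which makes Perl refuse to compile against an older autodie.
       This is a minimal sketch; the "use utf8::all" line is commented out
       only so that the snippet runs where the module is not installed.

```perl
use strict;
use warnings;
# use utf8::all;    # would go here where the module is installed
use autodie 2.12;   # compilation fails if the installed autodie is older

# The loaded version can also be inspected at run time.
my $autodie_version = autodie->VERSION;
```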

BUGS

       Please report any bugs or feature requests on the bugtracker website
       <https://github.com/doherty/utf8-all/issues>.

       When submitting a bug or request, please include a test-file or a patch
       to an existing test-file that illustrates the bug or desired feature.

COMPATIBILITY

       The filesystems of DOS, Windows, and OS/2 do not (fully) support UTF-8.
       The "readlink" and "readdir" functions and "glob" operators will
       therefore not be replaced on these systems.

SEE ALSO

       •   File::Find::utf8 for fully UTF-8-aware File::Find functions.

       •   Cwd::utf8 for fully UTF-8-aware Cwd functions.

AUTHORS

       •   Michael Schwern <mschwern@cpan.org>

       •   Mike Doherty <doherty@cpan.org>

       •   Hayo Baan <info@hayobaan.com>

COPYRIGHT AND LICENSE

       This software is copyright (c) 2009 by Michael Schwern
       <mschwern@cpan.org>; he originated it.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.

perl v5.34.0                      2022-01-21                      utf8::all(3)