Locale::libintlFAQ(3) User Contributed Perl Documentation Locale::libintlFAQ(3)

NAME

       Locale::TextDomain::FAQ - Frequently asked questions for libintl-perl

DESCRIPTION

       This FAQ

QUESTIONS AND ANSWERS

   Why is libintl-perl so big?  Why don't you use Encode(3pm) for character
   set conversion instead of rolling your own version?
       Encode(3pm) requires at least Perl 5.7.x, whereas libintl-perl needs to
       be operational on Perl 5.004.  Internally, libintl-perl uses
       Encode(3pm) if it is available.
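
       The "use it if it is available" pattern can be sketched as follows.
       This only illustrates optional module loading with a pure Perl
       fallback; it is not the code libintl-perl uses internally, and the
       helper name latin1_to_utf8 is an assumption made for the example.

```perl
use strict;
use warnings;

# Load Encode only if this perl ships it; older perls fall through
# to the hand-written conversion below.
my $HAVE_ENCODE = eval { require Encode; 1 } ? 1 : 0;

# Convert a string of ISO-8859-1 bytes to UTF-8 bytes.
# (latin1_to_utf8 is a hypothetical helper, not part of libintl-perl.)
sub latin1_to_utf8 {
    my ($bytes) = @_;

    if ($HAVE_ENCODE) {
        # from_to() converts the octets of its first argument in place.
        Encode::from_to($bytes, 'iso-8859-1', 'utf-8');
        return $bytes;
    }

    # Fallback: every byte >= 0x80 becomes a two-byte UTF-8 sequence.
    $bytes =~ s{([\x80-\xff])}
               { chr(0xc0 | (ord($1) >> 6)) . chr(0x80 | (ord($1) & 0x3f)) }ge;
    return $bytes;
}

my $utf8 = latin1_to_utf8("caf\xe9");
printf "%d bytes, Encode %s\n",
    length $utf8, $HAVE_ENCODE ? 'used' : 'not available';
```

       The fallback covers only one source charset, which hints at why a
       library that has to run without Encode(3pm) ends up carrying a lot of
       conversion code of its own.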

   Why do the gettext functions always unset the utf-8 flag on the strings
   they return?
       Because the gettext functions do not know whether the string is encoded
       in utf-8 or not.  Rather than guess, they unset the flag.

   Can I set the utf-8 flag on strings returned by the gettext family of
   functions?
       Yes, but it is not recommended.  If you absolutely want to do it, use
       the function bind_textdomain_filter in Locale::Messages.

       The strings returned by gettext and friends are by default encoded in
       the preferred charset for the user's locale, but there is no portable
       way to find out whether this is utf-8 or not.  That means you either
       have to enforce utf-8 as the output character set (by means of
       bind_textdomain_codeset() and/or the environment variable
       OUTPUT_CHARSET) and override the user preference, or you run the risk
       of marking strings as utf-8 which really aren't utf-8.

       The whole concept behind that utf-8 flag introduced in Perl 5.6 is
       seriously broken, and the dilemma described above is proof of that.
       The best thing you can do with that flag is to get rid of it and turn
       it off.  Your code will benefit from it and become less error prone,
       more portable and faster.

   Why do non-ASCII characters in my Gtk2 application look messed up?
       The Perl binding of Gtk2 has a design flaw.  It expects all UI messages
       to be in UTF-8 and it also expects messages to be flagged as utf-8.
       The only solution for you is to force all your po files to be encoded
       in utf-8 (convert them manually, if you need to), and also to enforce
       that charset in your application, regardless of the user's locale
       settings.  Assuming that your textdomain is "org.bar.foo", you have to
       code the following into your main module or script:

           use Locale::Messages qw (bind_textdomain_filter
                                    bind_textdomain_codeset turn_utf_8_on);

           BEGIN {
               bind_textdomain_filter 'org.bar.foo', \&turn_utf_8_on;
               bind_textdomain_codeset 'org.bar.foo', 'utf-8';
           }

       See the file GTestRunner.pm of Test::Unit::GTestRunner(3pm) for
       details.

   How do I interface Glade2 UI definitions with libintl-perl?
       Gtk2::GladeXML(3pm) seems to ignore calls to bind_textdomain().  See
       the file GTestRunner.pm of Test::Unit::GTestRunner(3pm) for a possible
       solution.

   Why does Locale::TextDomain use a double underscore?  I am used to a single
   underscore from C or other languages.
       A function whose name consists of exactly one non-alphanumerical
       character is automatically global in Perl.  Besides, at the time of
       writing, the underscore was slated to replace the dot as the
       concatenation operator in Perl 6.
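
       The first point can be checked with a few lines of Perl.  This is a
       minimal sketch; the package name My::Module and the function bodies
       are made up for the illustration and have nothing to do with what
       Locale::TextDomain actually exports.

```perl
use strict;
use warnings;

package My::Module;

# Looks like a private helper of My::Module ...
sub _ { return "translated: $_[0]" }

# ... while a two-character name stays in the package that declares it.
sub __ { return "translated: $_[0]" }

package main;

# Ask perl where each subroutine really ended up.
my $single_in_main   = defined &main::_        ? 1 : 0;
my $single_in_module = defined &My::Module::_  ? 1 : 0;
my $double_in_main   = defined &main::__       ? 1 : 0;
my $double_in_module = defined &My::Module::__ ? 1 : 0;

print "sub _  -> main: $single_in_main, My::Module: $single_in_module\n";
print "sub __ -> main: $double_in_main, My::Module: $double_in_module\n";
```

       A function named "_" would therefore be shared by every module in the
       program, whereas "__" can be imported separately into each package.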

   How do I switch languages or force a certain language independently from
   user settings read from the environment?
       The simple answer is:

           use POSIX qw (setlocale LC_ALL);

           my $language = 'fr';
           my $country = 'FR';
           my $charset = 'iso-8859-1';

           setlocale LC_ALL, "${language}_$country.$charset";

       Sadly enough, this will fail in many cases.  The problem is that locale
       identifiers are not standardized and are completely system-dependent:
       not only their overall format, but also details like case-sensitivity.
       Some systems are very forgiving about the format - for example
       normalizing charset descriptions - others are very strict.  In order to
       be reasonably platform independent, you should try a list of possible
       locale identifiers for your desired settings.  This is about what I
       would try for achieving the above:

           my @tries = qw (
               fr_FR.iso-8859-1 fr_FR.iso8859-1 fr_FR.iso88591
               fr_FR.ISO-8859-1 fr_FR.ISO8859-1 fr_FR.ISO88591
               fr.iso-8859-1 fr.iso8859-1 fr.iso88591
               fr.ISO-8859-1 fr.ISO8859-1 fr.ISO88591
               fr_FR
               French_France.iso-8859-1 French_France.iso8859-1 French_France.iso88591
               French_France.ISO-8859-1 French_France.ISO8859-1 French_France.ISO88591
               French.iso-8859-1 French.iso8859-1 French.iso88591
               French.ISO-8859-1 French.ISO8859-1 French.ISO88591
           );
           foreach my $try (@tries) {
               last if setlocale LC_ALL, $try;
           }

       See Locale::Util(3pm) for functions that help you with this.
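
       A list like the one above can also be generated instead of typed.  The
       following is a rough sketch using only POSIX; the helper names
       charset_variants, locale_candidates and first_working_locale are
       invented for this example and are not the Locale::Util API.  Unlike
       the hand-written list, it appends all bare prefixes at the end.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);

# Spelling variants of one charset name: lower and upper case, each with
# all hyphens, without the first hyphen, and without any hyphen.
sub charset_variants {
    my ($charset) = @_;
    my (@forms, %seen);
    for my $cs (lc $charset, uc $charset) {
        (my $first_removed = $cs) =~ s/-//;
        (my $all_removed   = $cs) =~ s/-//g;
        push @forms, $cs, $first_removed, $all_removed;
    }
    return grep { !$seen{$_}++ } @forms;
}

# Combine every prefix ("fr_FR", "French_France", ...) with every charset
# spelling, then append the bare prefixes as a last resort.
sub locale_candidates {
    my ($charset, @prefixes) = @_;
    my @charsets = charset_variants($charset);
    my @candidates;
    for my $prefix (@prefixes) {
        for my $cs (@charsets) {
            push @candidates, "$prefix.$cs";
        }
    }
    return (@candidates, @prefixes);
}

# Return the first identifier that setlocale() accepts, or undef.
sub first_working_locale {
    for my $try (@_) {
        return $try if setlocale(LC_ALL, $try);
    }
    return;
}

my @tries  = locale_candidates('iso-8859-1', qw(fr_FR fr French_France French));
my $chosen = first_working_locale(@tries);
print defined $chosen ? "Using locale $chosen\n" : "No French locale found\n";
```

       Which identifier wins, if any, depends entirely on the locales
       installed on the system running the script.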

       Alternatively, you can force a certain language by setting the
       environment variables LANGUAGE, LANG and OUTPUT_CHARSET, but this is
       only guaranteed to work if you use the pure Perl implementation of
       gettext (see the documentation for select_package() in
       Locale::Messages(3pm)).  You would do the above like this:

           use POSIX qw (setlocale LC_MESSAGES);
           use Locale::Messages qw (nl_putenv);

           # LANGUAGE is a colon separated list of languages.
           nl_putenv("LANGUAGE=fr_FR");

           # If LANGUAGE is set, LANG should be set to the primary language.
           # This is not needed for gettext, but for other parts of the system
           # it is.
           nl_putenv("LANG=fr_FR");

           # Force an output charset like this:
           nl_putenv("OUTPUT_CHARSET=iso-8859-1");

           setlocale (LC_MESSAGES, 'C');

       These environment variables are GNU extensions, and they are also
       honored by libintl-perl.  Still, you should always try to set the
       locale with setlocale for the catch-all category LC_ALL.  If you fail
       to do so, your program's output may be garbled, mixing languages and
       charsets, if the system runs in a locale that is not compatible with
       your own language settings.

       Remember that these environment variables are not guaranteed to work
       if you use an XS version of gettext.  In order to force usage of the
       pure Perl implementation, do the following:

           Locale::Messages->select_package ('gettext_pp');

       If you think this is brain-damaged, you are right, but I cannot help
       you.  Actually there should be a more flexible API than setlocale, but
       at the time of this writing there isn't.  Until then, the
       recommendation goes like this:

               1) Try setting LC_ALL with Locale::Util.
               2) If that does not succeed, either give up or ...
               3) Reset LC_MESSAGES to C/POSIX.
               4) Switch to pure Perl for gettext.
               5) Set the environment variables LANGUAGE, LANG,
                  and OUTPUT_CHARSET to your desired values.

   What is the advantage of libintl-perl over Locale::Maketext?
       Of course, I can only give my personal opinion as an answer.

       Locale::Maketext claims to fix design flaws in gettext.  These alleged
       design flaws, however, boil down to one pathological case which always
       has a workaround.  But both programmers and translators pay for this
       fix with an unnecessarily complicated interface.

       The paramount advantage of libintl-perl is that it uses a proven
       technology and concept.  Except for Java(tm) programs, this is the
       state-of-the-art concept for localizing Un*x software.  Programmers
       who have already localized software in C, C++, C#, Python, PHP, or a
       number of other languages will feel instantly at home when localizing
       software written in Perl with libintl-perl.  The same holds true for
       the translators, because the files they deal with have exactly the same
       format as those for other programming languages.  They can use the same
       set of tools, and even the commands they have to execute are the same.

       With libintl-perl, refactoring the software is painless, even if you
       modify, add or delete translatable strings.  The gettext tools are
       powerful enough to reduce the effort of the translators to the bare
       minimum.  Maintaining the message catalogs of Locale::Maketext in
       larger scale projects is IMHO unfeasible.

       Editing the message catalogs of Locale::Maketext - they are really Perl
       modules - asks too much from most translators, unless they are
       programmers.  The portable object (po) files used by libintl-perl have
       a simple syntax, and there are a number of specialized GUI editors for
       these files that facilitate the translation process and hide most of
       the complexity from the user.

       Furthermore, libintl-perl makes it possible to mix programming
       languages without a paradigm shift in localization.  Without any
       special effort, you can write localized software that has modules
       written in C, modules in Perl, and builds a Gtk user interface with
       Glade.  All translatable strings end up in one single message catalog.

       Last but not least, the interface used by libintl-perl is plain and
       simple: Prepend translatable strings with a double underscore, and you
       are done in most cases.

   Why do single-quoted strings not work?
       You probably write something like this:

           print __'Hello';

       And you get an error message like "Can't find string terminator "'"
       anywhere before EOF at ...", or even "Bareword found where operator
       expected at ... Might be a runaway multi-line '' string starting on".
       The above line is (really!) essentially the same as writing:

           print __::Hello';

       A lesser-known feature of Perl is that you can use a single quote ("'")
       as the separator in package names instead of the double colon ("::").
       What the Perl parser sees in the first example is a valid package name
       ("__") followed by the separator ("'"), then another valid package name
       ("Hello") followed by a lone single quote.  It is therefore not a
       problem in libintl-perl but simply wrong Perl syntax.  You have two
       correct alternatives:

           print __ 'Hello';   # Insert a space to disambiguate.

       Or use double-quotes:

           print __"Hello";
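
       The old-style separator itself can be demonstrated without
       libintl-perl.  This is only a sketch; the package Greeting and its
       function are made up for the example.  It runs on the perl v5.36 this
       page was generated with, but more recent perl releases deprecate the
       syntax, so do not rely on it in real code.

```perl
use strict;
use warnings;
no warnings 'deprecated';   # newer perls warn about the old separator

package Greeting;

sub hello { return 'Hello from Greeting::hello' }

package main;

# Both lines call the same subroutine: "'" is the old-style spelling
# of "::" in package-qualified names.
my $modern = Greeting::hello();
my $legacy = Greeting'hello();

print "$modern\n$legacy\n";
```

       In the same way, the parser reads __'Hello as the qualified name
       __::Hello rather than as a call to __ with a string argument.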

       Thanks to Slavi Agafonkin for pointing me to the solution of this
       mystery.

   What options should be used with xgettext?
       More precisely, the question should be which '--keyword' and '--flag'
       options should be used with xgettext.  All other options depend
       completely on your use case.

       If you are using Locale::Messages or Locale::Gettext for localizing
       Perl code, the default keywords and default flags built into xgettext
       are correct.

       If you are using Locale::TextDomain you have to use a long list of
       command-line options for xgettext.  Beginning with libintl-perl 1.28
       you can use the library itself to produce these options:

           perl -MLocale::TextDomain -e 'print Locale::TextDomain->options'

       If you want to disable the use of the built-in default keywords,
       precede the output of the above command with '--keyword=""'.  That will
       reset the keywords for xgettext.

   Why isn't there a function N__x(), N__nx(), or N__px()?
       The sole purpose of these functions would be to set proper flags in the
       output of xgettext(1).  You probably thought of something like this:

           xgettext --keyword=N__x --flag=N__x:1:perl-brace-format filename.pl

       First of all, xgettext(1) will always set the flag correctly if the
       argument to N__() looks like a brace format string.

       Second, you can set any flag you want on the PO entry with a source
       code comment:

           # xgettext: no-perl-brace-format
           my $msg = N__("Placeholders are enclosed in {curly} braces.");

       When xgettext(1) extracts the string, it will appear like this in the
       .pot file:

           #: filename.pl:2304
           #, no-perl-brace-format
           msgid "Placeholders are enclosed in {curly} braces."
           msgstr ""

       There is no reason to pollute the namespace with N__x functions.

perl v5.36.0                      2022-07-22             Locale::libintlFAQ(3)