Locale::TextDomain(3)  User Contributed Perl Documentation  Locale::TextDomain(3)

NAME

       Locale::TextDomain - Perl Interface to Uniforum Message Translation

SYNOPSIS

        use Locale::TextDomain ('my-package', @locale_dirs);

        use Locale::TextDomain qw (my-package);

        my $translated = __"Hello World!\n";

        my $alt = $__{"Hello World!\n"};

        my $alt2 = $__->{"Hello World!\n"};

        my @list = (N__"Hello", N__"World");

        my @plurals = (N__ ("One world", "{num} worlds"),
                       N__ ("1 file", "%d files"));

        my $question = __x ("Error reading file '{file}': {err}",
                            file => $file, err => $!);

        printf (__n ("one file read",
                     "%d files read",
                     $num_files),
                $num_files);

        print __nx ("one file read", "{num} files read", $num_files,
                    num => $num_files);

DESCRIPTION

       The module Locale::TextDomain(3pm) provides a high-level interface to
       Perl message translation.

       Textdomains

       When you request a translation for a given string, the system used in
       libintl-perl follows a standard strategy to find a suitable message
       catalog containing the translation: unless you explicitly define a
       name for the message catalog, libintl-perl will assume that your
       catalog is called 'messages' (unless you have changed the default
       value to something else via textdomain() in Locale::Messages(3pm)).

       You might think that this default strategy leaves room for
       optimization, and you are right.  It would be a lot smarter if multiple
       software packages, all with their individual message catalogs, could be
       installed on one system, and it should also be possible for third-party
       components of your software (like Perl modules) to load their message
       catalogs, too, without interfering with yours.

       The solution is clear: you have to assign a unique name to your message
       database, and you have to specify that name at run-time.  That unique
       name is the so-called textdomain of your software package.  The name is
       actually arbitrary, but you should follow these best-practice
       guidelines to ensure maximum interoperability:

       File System Safety
               In practice, textdomains get mapped into file names, and you
               should therefore make sure that the textdomain you choose is a
               valid filename on every system that will run your software.

       Case-sensitivity
               Textdomains are always case-sensitive (i.e. 'Package' and
               'PACKAGE' are not the same).  However, since the message
               catalogs will be stored on file systems that may or may not
               distinguish case when looking up file names, you should avoid
               textdomains that differ only in case.

       Textdomain Should Match CPAN Name
               If your software is listed as a module on CPAN, you should
               simply choose the name on CPAN as your textdomain.  The
               textdomain for libintl-perl is hence 'libintl-perl'.  But
               please replace all periods ('.') in your package name with an
               underscore because ...

       Internet Domain Names as a Fallback
               ... if your software is not a module listed on CPAN, as a last
               resort you should use the Java(tm) package scheme, i.e. choose
               an internet domain that you own (or ask the owner of an
               internet domain) and append your preferred textdomain to the
               reversed internet domain.  Example: your company runs the
               web-site 'www.foobar.org' and is the owner of the domain
               'foobar.org'.  The textdomain for your company's software
               'barfoos' should hence be 'org.foobar.barfoos'.

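       The last two guidelines can be combined in a few lines of Perl.  The
       following is only a sketch: the helper textdomain_for() is hypothetical
       and not part of libintl-perl.  It assumes that you pass the internet
       domain you own and the name of your package.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of libintl-perl): build a textdomain
# from an internet domain you own and a package name.
sub textdomain_for {
    my ($internet_domain, $package) = @_;

    # Periods inside the package name would clash with the separators
    # of the reversed domain, so replace them with underscores.
    (my $safe_package = $package) =~ tr/./_/;

    # 'foobar.org' becomes 'org.foobar'.
    my $reversed = join '.', reverse split /\./, $internet_domain;

    return "$reversed.$safe_package";
}

print textdomain_for('foobar.org', 'barfoos'), "\n";   # org.foobar.barfoos
```

       The resulting string is what you would pass as the first argument to
       "use Locale::TextDomain".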
       If your software is likely to be installed in different versions on the
       same system, it is probably a good idea to append some version
       information to your textdomain.

       Other systems are less strict with the naming scheme for textdomains,
       but the phenomenon known as Perl is actually a plethora of small,
       specialized modules, and it is probably wisest to postulate some
       namespace model in order to avoid chaos.

       Binding textdomains to directories

       Once the system knows the textdomain of the message that you want to
       get translated into the user's language, it still has to find the
       correct message catalog.  By default, libintl-perl will look up the
       string in the translation database found in the directories
       /usr/share/locale and /usr/local/share/locale (in that order).

       It is neither guaranteed that these directories exist on the target
       machine, nor can you be sure that the installation routine has write
       access to these locations.  You can therefore instruct libintl-perl to
       search other directories prior to the default directories.  Specifying
       a different search directory is called binding a textdomain to a
       directory.

       Locale::TextDomain extends the default strategy by a Perl-specific
       approach.  Unless told otherwise, it will look for a directory
       LocaleData in every component found in the standard include path @INC
       and check for a database containing the message for your textdomain
       there.  Example: if the path /usr/lib/perl/5.8.0/site_perl is in your
       @INC, you can install your translation files in
       /usr/lib/perl/5.8.0/site_perl/LocaleData, and they will be found at
       run-time.

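       To see which LocaleData directories exist on a given machine, you can
       walk @INC yourself.  The sketch below only mirrors the lookup described
       above; it is not the code used by Locale::TextDomain, and the helper
       name locale_data_dirs() is made up for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list the existing LocaleData directories below
# the entries of an include path (defaults to @INC).
sub locale_data_dirs {
    my @include_path = @_ ? @_ : @INC;

    my @found;
    for my $entry (@include_path) {
        next if ref $entry;    # @INC may also contain code references
        my $dir = File::Spec->catdir($entry, 'LocaleData');
        push @found, $dir if -d $dir;
    }
    return @found;
}

print "$_\n" for locale_data_dirs();
```

       An empty output simply means that no module has installed message
       catalogs below @INC yet.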

USAGE

       It is crucial to remember that you use Locale::TextDomain(3) as
       specified in the section "SYNOPSIS"; that means you have to use it, not
       require it.  The module behaves quite differently compared to other
       modules.

       The most significant difference is the meaning of the list passed as an
       argument to the use() function.  It actually works like this:

           use Locale::TextDomain (TEXTDOMAIN, DIRECTORY, ...)

       The first argument (the first string passed to use()) is the textdomain
       of your package, optionally followed by a list of directories to search
       instead of the Perl-specific directories (see above: /LocaleData
       appended to every part of @INC).

       If you are the author of a package 'barfoos', you will probably put the
       line

           use Locale::TextDomain 'barfoos';

       or, for non-CPAN modules,

           use Locale::TextDomain 'org.foobar.barfoos';

       in every module of your package that contains translatable strings.  If
       your module has been installed properly, including the message
       catalogs, it will then be able to retrieve these translations at
       run-time.

       If you have not installed the translation database in a directory
       LocaleData in the standard include path @INC (or in the system
       directories /usr/share/locale or /usr/local/share/locale), you have to
       explicitly specify a search path by giving the names of directories
       (as strings!) as additional arguments to use():

           use Locale::TextDomain qw (barfoos ./dir1 ./dir2);

       Alternatively you can call the function bindtextdomain() with suitable
       arguments (see the entry for bindtextdomain() in "FUNCTIONS" in
       Locale::Messages).  If you do so, you should pass "undef" as an
       additional argument in order to avoid unnecessary lookups:

           use Locale::TextDomain ('barfoos', undef);

       You see that the arguments given to use() have nothing to do with what
       is imported into your namespace; they are rather arguments to
       textdomain() and bindtextdomain(), respectively.  Does that mean that
       Locale::TextDomain exports nothing into your namespace?  Um, not
       exactly ... in fact it exports all functions listed below into your
       namespace, and hence you should not define conflicting functions (and
       variables) yourself.

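       If you wonder how a module can treat its use() arguments as
       configuration and still export functions, the following sketch may
       help.  It is not the code of Locale::TextDomain: the package
       My::TextDomain is invented for this example, and its __() merely
       reports the textdomain instead of looking anything up.

```perl
use strict;
use warnings;

# Hypothetical module, defined inline so that the sketch is self-contained.
package My::TextDomain;

sub import {
    my ($class, $textdomain, @search_dirs) = @_;
    my $caller = caller;

    # The arguments configure the module; they do not select symbols.
    my $domain = defined $textdomain ? $textdomain : 'messages';

    # A fixed set of functions is exported regardless of the arguments.
    no strict 'refs';
    *{"${caller}::__"} = sub {
        my ($msgid) = @_;
        # A real implementation would look $msgid up in the catalog of
        # $domain; the sketch only shows which domain would be used.
        return "[$domain] $msgid";
    };
    return;
}

package main;

# "use My::TextDomain 'barfoos';" would call import() at compile time.
# The module lives in this file, so the call is spelled out here.
BEGIN { My::TextDomain->import('barfoos') }

my $greeting = __("Hello world!");
print "$greeting\n";    # [barfoos] Hello world!
```

       Because import() only runs as part of use(), a plain require would
       neither record the textdomain nor install the functions.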
       So, why does Locale::TextDomain have to be different from other
       modules?  If you have ever written software in C and prepared it for
       internationalization (i18n), you will probably have defined some
       preprocessor macros like:

           #define _(String) dgettext ("my-textdomain", String)
           #define N_(String) String

       You only have to define that once in C, and the textdomain for your
       package is automatically inserted into all gettext functions.  In Perl
       there is no such mechanism (at least no portable one, see option -P),
       and using the gettext functions could become quite cumbersome without
       some extra fiddling:

           print dgettext ("my-textdomain", "Hello world!\n");

       This is no fun.  In C it would merely be a

           printf (_("Hello world!\n"));

       Perl has to be more concise and shorter than C ... see the next section
       for how you can use Locale::TextDomain to end up in Perl with a mere

           print __"Hello World!\n";

EXPORTED FUNCTIONS

       All functions have quite funny names on purpose.  In fact the purpose
       is quite clear: they should be short and operator-like, and they
       should not invite conflicts with existing functions in your
       namespace.  You will understand it when you internationalize your first
       Perl program or module.  Preparing it is more like marking strings as
       being translatable than inserting function calls.  Here we go:

       __ MSGID
           NOTE: This is a double underscore!

           The basic and most-used function.  It is a short-cut for a call to
           gettext() or dgettext(), and simply returns the translation for
           MSGID.  If your old code reads like this:

               print "permission denied";

           You will now write:

               print __"permission denied";

           That's all; the string will be output in the user's preferred
           language, provided that you have installed a translation for it.

           Of course you can also use parentheses:

               print __("permission denied");

           Or even:

               print (__("permission denied"));

           In my eyes, the first version without parentheses looks best.

       __x MSGID, ID1 => VAL1, ID2 => VAL2, ...
           One of the nicest features in Perl is its capability to interpolate
           variables into strings:

               print "This is the $color $thing.\n";

           This nice feature might con you into thinking that you could now
           write

               print __"This is the $color $thing.\n";

           Alas, that would be nice, but it is not possible.  Remember that
           the function __() serves both as an operator for translating
           strings and as a mark for translatable strings.  If the above
           string were extracted from your Perl code, the un-interpolated
           form would end up in the message catalog, because when parsing your
           code it is unpredictable what values the variables $thing and
           $color will have at run-time (this fact is most probably one of the
           reasons why you have written your program in the first place).

           However, at run-time, Perl will have interpolated the values
           already before __() (or rather the underlying gettext() function)
           has seen the original string.  Consequently something like "This is
           the red car.\n" will be looked up in the message catalog, it will
           not be found (because only "This is the $color $thing.\n" is
           included in the database), and the original, untranslated string
           will be returned.  Because this is almost always an error, the
           xgettext(1) program will bail out with a fatal error when it comes
           across that string in your code.

           There are two workarounds for that.  The first one is printf():

               printf __"This is the %s %s.\n", $color, $thing;

           But that has several disadvantages: your translator will only see
           the isolated string, and without the surrounding code it is almost
           impossible to interpret it correctly.  Of course, GNU emacs and
           other software capable of editing PO translation files will allow
           you to examine the context in the source code, but it is more
           likely that your translator will look for a less challenging
           translation project when she frequently comes across such messages.

           And even if she does understand the underlying programming, what if
           she has to reorder the color and the thing like in French:

               msgid "This is the red car.\n"
               msgstr "Cela est la voiture rouge.\n"

           Zut alors! No way! You cannot portably reorder the arguments to
           printf() and friends in Perl (it is possible in C, but at the time
           of this writing not supported in Perl, and it would lead to other
           problems anyway).

           So what? The Perl backend to GNU gettext has defined an alternative
           format for interpolatable strings:

               "This is the {color} {thing}.\n";

           Instead of Perl variables you use place-holders in curly braces
           (legal Perl variable names are also legal place-holders), and then
           you call

               print __x ("This is the {color} {thing}.\n",
                          thing => $thing,
                          color => $color);

           The function __x() will take the additional hash and replace all
           occurrences of the hash keys in curly braces with the corresponding
           values.  Simple, readable, understandable to translators, what
           else would you want?  And if the translator forgets, misspells or
           otherwise messes up some "variables", the msgfmt(1) program, which
           is used to compile the textual translation file into its binary
           representation, will even choke on these errors and refuse to
           compile the translation.

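           The interpolation step itself is easy to picture.  The following
           sketch is not the implementation used by __x(); the function
           expand_placeholders() is a hypothetical stand-in that performs
           only the replacement, without any catalog lookup.  It shows why a
           translator can reorder the place-holders freely.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the interpolation step of __x(): replace
# each {name} with the value stored under that name.  Place-holders
# without a defined value are left untouched in this sketch.
sub expand_placeholders {
    my ($template, %values) = @_;

    $template =~ s/\{([^{}]+)\}/defined $values{$1} ? $values{$1} : "{$1}"/ge;

    return $template;
}

# The original message and a (reordered) French translation of it.
print expand_placeholders("This is the {color} {thing}.\n",
                          color => 'red', thing => 'car');
print expand_placeholders("Cela est la {thing} {color}.\n",
                          color => 'rouge', thing => 'voiture');
```

           The two calls print "This is the red car." and "Cela est la
           voiture rouge."; the order of the named arguments does not matter.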
       __n MSGID, MSGID_PLURAL, COUNT
           Whew! That looks complicated ... It is best explained with an
           example.  We'll have another look at your vintage code:

               if ($files_deleted > 1) {
                   print "All files have been deleted.\n";
               } else {
                   print "One file has been deleted.\n";
               }

           Your intent is clear: you wanted to avoid the cumbersome "1 files
           deleted".  This is okay for English, but other languages have more
           than one plural form.  For example, in Russian it makes a
           difference whether you want to say 1 file, 3 files or 6 files.  You
           will use three different forms of the noun 'file', one for each
           case.  [Note: Yep, very smart you are, the Russian word for 'file'
           is in fact the English word, and it is an invariable noun, but if
           you know that, you will also understand the rest despite this
           little simplification ...].

           That is the reason for the existence of the function ngettext(),
           which __n() is a short-cut for:

               print __n"One file has been deleted.\n",
                        "All files have been deleted.\n",
                        $files_deleted;

           Alternatively:

               print __n ("One file has been deleted.\n",
                          "All files have been deleted.\n",
                          $files_deleted);

           The effect is always the same: libintl-perl will find out which
           plural form to pick for your user's language, and the output string
           will always look okay.

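           To make the Russian example concrete, here is the plural rule that
           the GNU gettext manual lists for Russian, written as a Perl
           function.  This is only an illustration: at run-time the rule is
           taken from the Plural-Forms header of the message catalog, and the
           function name russian_plural_index() is made up for this sketch.

```perl
use strict;
use warnings;

# Plural rule for Russian as given in the GNU gettext manual:
#   nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 :
#       n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);
# The return value is the index of the msgstr[] entry to use.
sub russian_plural_index {
    my ($n) = @_;

    return 0 if $n % 10 == 1 && $n % 100 != 11;
    return 1 if $n % 10 >= 2 && $n % 10 <= 4
             && ($n % 100 < 10 || $n % 100 >= 20);
    return 2;
}

printf "%3d -> form %d\n", $_, russian_plural_index($_)
    for 1, 3, 6, 11, 21, 22;
```

           The counts 1, 3 and 6 select three different forms, which is
           exactly the distinction that a simple "if ($n > 1)" cannot express.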
       __nx MSGID, MSGID_PLURAL, COUNT, VAR1 => VAL1, VAR2 => VAL2, ...
           Bringing it all together:

               print __nx ("One file has been deleted.\n",
                           "{count} files have been deleted.\n",
                           $num_files,
                           count => $num_files);

           The function __nx() picks the correct plural form (also for
           English!) and it is capable of interpolating variables into
           strings.

           Have a close look at the order of arguments: The first argument is
           the string in the singular, the second one is the plural string.
           The third one is an integer indicating the number of items.  This
           third argument is only used to pick the correct translation.  The
           optionally following arguments make up the hash used for
           interpolation.  In the beginning it is often a little confusing
           that the variable holding the number of items will usually be
           repeated somewhere in the interpolation hash.

       __xn MSGID, MSGID_PLURAL, COUNT, VAR1 => VAL1, VAR2 => VAL2, ...
           Does exactly the same thing as __nx().  In fact it is a common typo
           promoted to a feature.

       N__ (ARG1, ARG2, ...)
           A no-op function that simply echoes its arguments to the caller.
           Take the following piece of Perl:

               my @options = (
                   "Open",
                   "Save",
                   "Save As",
               );

               ...

               my $option = $options[1];

           Now say that you want to have this translatable.  Sometimes you
           could simply do:

               my @options = (
                   __"Open",
                   __"Save",
                   __"Save As",
               );

               ...

               my $option = $options[1];

           But often this will not be what you want, for example when
           you also need the unmodified original string.  Sometimes it may not
           even work, for example when the preferred user language is not yet
           determined at the time that the list is initialized.

           In these cases you would write:

               my @options = (
                   N__"Open",
                   N__"Save",
                   N__"Save As",
               );

               ...

               my $option = __($options[1]);
               # or: my $option = dgettext ('my-domain', $options[1]);

           Now all the strings in @options will be left alone, since N__()
           returns its arguments (one or more) unmodified.  Nevertheless, the
           string extractor will be able to recognize the strings as being
           translatable.  And you can still get the translation later by
           passing the variable instead of the string.

       N__n (ARG1, ...)
           Does exactly the same as N__().  You will use this form if you have
           to mark the strings as having plural forms.

EXPORTED VARIABLES

       The module exports several variables into your namespace:

       %__ A tied hash.  Its keys are your original messages, the values are
           their translations:

               my $title = "<h1>$__{'My Homepage'}</h1>";

           This is much better for your translation team than

               my $title = __"<h1>My Homepage</h1>";

           In the second case the HTML code will make it into the translation
           database and your translators have to be aware of HTML syntax when
           translating strings.

       $__ A reference to "%__", in case you prefer:

               my $title = "<h1>$__->{'My Homepage'}</h1>";

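       If tied hashes are new to you, the following sketch shows the general
       idea.  It is not the implementation behind "%__": the package
       My::TranslatingHash, the hash %catalog and the German string are all
       made up for this example, and the "translation" is a plain hash
       lookup.

```perl
use strict;
use warnings;

# Hypothetical package: a read-only tied hash that hands every key to
# a translation callback.
package My::TranslatingHash;

sub TIEHASH {
    my ($class, $translate) = @_;
    return bless { translate => $translate }, $class;
}

sub FETCH {
    my ($self, $msgid) = @_;
    return $self->{translate}->($msgid);
}

package main;

# Stand-in for a catalog lookup; untranslated messages fall through.
my %catalog = ('My Homepage' => 'Meine Homepage');

tie my %t, 'My::TranslatingHash', sub {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
};

print "<h1>$t{'My Homepage'}</h1>\n";    # <h1>Meine Homepage</h1>
```

       Because the lookup happens inside FETCH, the hash can be interpolated
       into a double-quoted string while the markup stays out of the message.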

PERFORMANCE

       Message translation can be a time-consuming task.  Take this little
       example:

           1: use Locale::TextDomain ('my-domain');
           2: use POSIX qw(:locale_h);
           3:
           4: setlocale (LC_ALL, '');
           5: print __"Hello world!\n";

       This will usually be quite fast, but in pathological cases it may run
       for several seconds.  A worst-case scenario would be a Chinese user at
       a terminal that understands the codeset Big5-HKSCS, while your
       translator for Chinese has chosen to encode the translations in the
       codeset EUC-TW.

       What will happen at run-time?  First, the library will search for and
       load a (maybe large) message catalog for your textdomain 'my-domain'.
       Then it will look up the translation for "Hello world!\n" and find that
       it is encoded in EUC-TW.  Since that differs from the output codeset
       Big5-HKSCS, it will first load a conversion table containing several
       tens of thousands of code points for EUC-TW, and then do the same with
       the smaller, but still very large conversion table for Big5-HKSCS.  It
       will convert the translation on the fly from EUC-TW into Big5-HKSCS,
       and finally it will return the converted translation.

       A worst-case scenario, but a realistic one.  And for these five lines
       of code, there is not much you can do to make it any faster.  You
       should understand, however, when the different steps will take place,
       so that you can arrange your code for it.

       You have learned in the section "DESCRIPTION" that line 1 is
       responsible for locating your message database.  However, the use()
       will do nothing more than remember your settings.  It will not search
       any directories, and it will not load any catalogs or conversion
       tables.

       Somewhere in your code you will always have a call to
       POSIX::setlocale(), and this call may be time-consuming, depending on
       the architecture of your system.  On some systems, this will consume
       very little time, on others it will consume a considerable amount of
       time only for the first call, and on others it may always be
       time-consuming.  Since you cannot know how setlocale() is implemented
       on the target system, you should reduce the calls to setlocale() to a
       minimum.

       Line 5 requests the translation for your string.  Only now will the
       library actually load the message catalog, and only now will it load
       any conversion tables that are needed.  From now on, all this
       information will be cached in memory.  This strategy is used throughout
       libintl-perl, and you may describe it as 'load-on-first-access'.
       Getting the next translation will consume very few resources.

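       The 'load-on-first-access' strategy can be sketched in a few lines.
       This is not the code of libintl-perl: load_catalog() and translate()
       are hypothetical names, and the "catalog" is a hard-coded hash that
       stands in for the file read and codeset conversion described above.

```perl
use strict;
use warnings;

my %catalog_cache;      # textdomain => hash of translations
my $loads = 0;          # counts how often the "expensive" step runs

# Hypothetical stand-in for reading a message catalog from disk.
sub load_catalog {
    my ($textdomain) = @_;
    $loads++;
    return { "Hello world!\n" => "Hallo Welt!\n" };
}

# Load the catalog on first access only, then answer from memory.
sub translate {
    my ($textdomain, $msgid) = @_;
    my $catalog = $catalog_cache{$textdomain} ||= load_catalog($textdomain);
    return exists $catalog->{$msgid} ? $catalog->{$msgid} : $msgid;
}

print translate('my-domain', "Hello world!\n") for 1 .. 3;
print "catalog loaded $loads time(s)\n";    # catalog loaded 1 time(s)
```

       Only the first call pays for loading; every later call is a hash
       lookup plus the cost of the function call itself.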
       However, although the translation retrieval is somewhat obfuscated by
       an operator-like function call, it is still a function call, and in
       fact it even involves a chain of function calls.  Consequently, the
       following example is probably bad practice:

           foreach (1 .. 100_000) {
               print __"Hello world!\n";
           }

       This example introduces a lot of overhead into your program.  Better do
       this:

           my $string = __"Hello world!\n";
           foreach (1 .. 100_000) {
               print $string;
           }

       The translation will never change, so there is no need to retrieve it
       over and over again.  Although libintl-perl will of course cache the
       translation read from the file system, you can still avoid the overhead
       for the function calls.

AUTHOR

       Copyright (C) 2002-2004, Guido Flohr <guido@imperia.net>, all rights
       reserved.  See the source code for details.

       This software is contributed to the Perl community by Imperia
       (<http://www.imperia.net/>).

SEE ALSO

       Locale::Messages(3pm), Locale::gettext_pp(3pm), perl(1), gettext(1),
       gettext(3)

perl v5.8.8                       2006-08-28             Locale::TextDomain(3)