Locale::TextDomain(3)  User Contributed Perl Documentation  Locale::TextDomain(3)

NAME

       Locale::TextDomain - Perl Interface to Uniforum Message Translation

SYNOPSIS

        use Locale::TextDomain ('my-package', @locale_dirs);

        use Locale::TextDomain qw (my-package);

        my $translated = __"Hello World!\n";

        my $alt = $__{"Hello World!\n"};

        my $alt2 = $__->{"Hello World!\n"};

        my @list = (N__"Hello",
                    N__"World");

        printf (__n ("one file read",
                     "%d files read",
                     $num_files),
                $num_files);

        print __nx ("one file read", "{num} files read", $num_files,
                    num => $num_files);

        my $translated_context = __p ("Verb, to view", "View");

        printf (__np ("Files read from filesystems",
                      "one file read",
                      "%d files read",
                      $num_files),
                $num_files);

        print __npx ("Files read from filesystems",
                     "one file read",
                     "{num} files read",
                     $num_files,
                     num => $num_files);

DESCRIPTION

       The module Locale::TextDomain(3pm) provides a high-level interface to
       Perl message translation.

   Textdomains
       When you request a translation for a given string, the system used in
       libintl-perl follows a standard strategy to find a suitable message
       catalog containing the translation: Unless you explicitly define a
       name for the message catalog, libintl-perl will assume that your
       catalog is called 'messages' (unless you have changed the default value
       to something else via Locale::Messages(3pm), method textdomain()).

       You might think that this default strategy leaves room for optimization
       and you are right.  It would be a lot smarter if multiple software
       packages, all with their individual message catalogs, could be
       installed on one system, and it should also be possible for
       third-party components of your software (like Perl modules) to load
       their message catalogs, too, without interfering with yours.

       The solution is clear: you have to assign a unique name to your message
       database, and you have to specify that name at run-time.  That unique
       name is the so-called textdomain of your software package.  The name is
       actually arbitrary but you should follow these best-practice guidelines
       to ensure maximum interoperability:

       File System Safety
               In practice, textdomains get mapped into file names, and you
               should therefore make sure that the textdomain you choose is a
               valid filename on every system that will run your software.

       Case-sensitivity
               Textdomains are always case-sensitive (i.e. 'Package' and
               'PACKAGE' are not the same).  However, since the message
               catalogs will be stored on file systems that may or may not
               distinguish case when looking up file names, you should avoid
               potential conflicts here.

       Textdomain Should Match CPAN Name
               If your software is listed as a module on CPAN, you should
               simply choose the name on CPAN as your textdomain.  The
               textdomain for libintl-perl is hence 'libintl-perl'.  But
               please replace all periods ('.') in your package name with an
               underscore because ...

       Internet Domain Names as a Fallback
               ... if your software is not a module listed on CPAN, as a last
               resort you should use the Java(tm) package scheme, i.e. choose
               an internet domain that you own (or ask the owner of an
               internet domain) and concatenate your preferred textdomain
               with the reversed internet domain.  Example: Your company runs
               the web-site 'www.foobar.org' and is the owner of the domain
               'foobar.org'.  The textdomain for your company's software
               'barfoos' should hence be 'org.foobar.barfoos'.

       If your software is likely to be installed in different versions on the
       same system, it is probably a good idea to append some version
       information to your textdomain.

       Other systems are less strict with the naming scheme for textdomains
       but the phenomenon known as Perl is actually a plethora of small,
       specialized modules and it is probably wisest to postulate some
       namespace model in order to avoid chaos.

   Binding textdomains to directories
       Once the system knows the textdomain of the message that you want to
       get translated into the user's language, it still has to find the
       correct message catalog.  By default, libintl-perl will look up the
       string in the translation database found in the directories
       /usr/share/locale and /usr/local/share/locale (in that order).

       It is neither guaranteed that these directories exist on the target
       machine, nor can you be sure that the installation routine has write
       access to these locations.  You can therefore instruct libintl-perl to
       search other directories prior to the default directories.  Specifying
       a different search directory is called binding a textdomain to a
       directory.

       Locale::TextDomain extends the default strategy by a Perl-specific
       approach.  Unless told otherwise, it will look for a directory
       LocaleData in every component found in the standard include path @INC
       and check for a database containing the message for your textdomain
       there.  Example: If the path /usr/lib/perl/5.8.0/site_perl is in your
       @INC, you can install your translation files in
       /usr/lib/perl/5.8.0/site_perl/LocaleData, and they will be found at
       run-time.

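       The @INC lookup described above can be approximated in a few lines.
       This is only a sketch of the idea, not the module's actual search
       code: the helper name locale_data_candidates() is made up for this
       example, and the real module may apply further rules when it
       searches.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of libintl-perl): build the list of
# LocaleData directories that sit directly below the given include paths.
# Code references, which @INC may contain as hooks, are skipped.
sub locale_data_candidates {
    my @include_paths = @_;
    return map  { File::Spec->catdir($_, 'LocaleData') }
           grep { defined($_) && !ref($_) } @include_paths;
}

# Report which candidate directories exist on this machine.
for my $dir (locale_data_candidates(@INC)) {
    printf "%-8s %s\n", (-d $dir ? 'found' : 'missing'), $dir;
}
```

       Running this on the target machine shows quickly whether an
       installed LocaleData directory is on the include path at all.
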

USAGE

       It is crucial to remember that you use Locale::TextDomain(3) as
       specified in the section "SYNOPSIS"; that is, you have to use it, not
       require it.  The module behaves quite differently compared to other
       modules.

       The most significant difference is the meaning of the list passed as an
       argument to the use() function.  It actually works like this:

           use Locale::TextDomain (TEXTDOMAIN, DIRECTORY, ...)

       The first argument (the first string passed to use()) is the textdomain
       of your package, optionally followed by a list of directories to search
       instead of the Perl-specific directories (see above: /LocaleData
       appended to every part of @INC).

       If you are the author of a package 'barfoos', you will probably put the
       line

           use Locale::TextDomain 'barfoos';

       or, for non-CPAN modules,

           use Locale::TextDomain 'org.foobar.barfoos';

       in every module of your package that contains translatable strings. If
       your module has been installed properly, including the message
       catalogs, it will then be able to retrieve these translations at
       run-time.

       If you have not installed the translation database in a directory
       LocaleData in the standard include path @INC (or in the system
       directories /usr/share/locale or /usr/local/share/locale), you have
       to explicitly specify a search path by giving the names of directories
       (as strings!) as additional arguments to use():

           use Locale::TextDomain qw (barfoos ./dir1 ./dir2);

       Alternatively you can call the function bindtextdomain() with suitable
       arguments (see the entry for bindtextdomain() in "FUNCTIONS" in
       Locale::Messages).  If you do so, you should pass "undef" as an
       additional argument in order to avoid unnecessary lookups:

           use Locale::TextDomain ('barfoos', undef);

       You see that the arguments given to use() have nothing to do with what
       is imported into your namespace, but they are rather arguments to
       textdomain() or bindtextdomain().  Does that mean that
       Locale::TextDomain exports nothing into your namespace? Umh, not
       exactly ... in fact it imports all functions listed below into your
       namespace, and hence you should not define conflicting functions (and
       variables) yourself.

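       Why use() matters, and why its argument list can mean something
       other than "names to import", follows from how Perl calls a
       module's import() method.  The sketch below shows the general Perl
       mechanism only; it is not Locale::TextDomain's own code.  The
       package My::Domain, the hash %domain_for and the exported function
       my_textdomain() are all invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical module, defined inline so that the sketch runs on its own.
package My::Domain;

our %domain_for;    # calling package => textdomain

sub import {
    my ($class, $textdomain, @search_dirs) = @_;
    my $caller = caller;

    # The argument list configures the module for the calling package ...
    $domain_for{$caller} = $textdomain;

    # ... while the exported name is fixed, regardless of the arguments.
    no strict 'refs';
    *{"${caller}::my_textdomain"} = sub { return $domain_for{$caller} };
    return;
}

package main;

# For a module stored in its own file, "use My::Domain 'barfoos';" performs
# this call at compile time.  A plain require() would skip it.
BEGIN { My::Domain->import('barfoos') }

my $domain = my_textdomain();
print "$domain\n";
```

       A module loaded with require() alone never gets its import() called,
       so in a design like this neither the configuration nor the exported
       function would be set up.
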
       So, why does Locale::TextDomain have to be different from other
       modules?  If you have ever written software in C and prepared it for
       internationalization (i18n), you will probably have defined some
       preprocessor macros like:

           #define _(String) dgettext ("my-textdomain", String)
           #define N_(String) String

       You only have to define that once in C, and the textdomain for your
       package is automatically inserted into all gettext functions.  In Perl
       there is no such mechanism (at least it is not portable, option -P) and
       using the gettext functions could become quite cumbersome without some
       extra fiddling:

           print dgettext ("my-textdomain", "Hello world!\n");

       This is no fun.  In C it would merely be a

           printf (_("Hello world!\n"));

       Perl has to be more concise and shorter than C ... see the next section
       for how you can use Locale::TextDomain to end up in Perl with a mere

           print __"Hello World!\n";

EXPORTED FUNCTIONS

       All functions have quite funny names on purpose.  In fact the purpose
       for that is quite clear: They should be short, operator-like, and they
       should not invite conflicts with existing functions in your
       namespace.  You will understand it when you internationalize your
       first Perl program or module.  Preparing it is more like marking
       strings as being translatable than inserting function calls.  Here we
       go:

       __ MSGID
           NOTE: This is a double underscore!

           The basic and most-used function.  It is a short-cut for a call to
           gettext() or dgettext(), and simply returns the translation for
           MSGID.  If your old code reads like this:

               print "permission denied";

           You will now write:

               print __"permission denied";

           That's all, the string will be output in the user's preferred
           language, provided that you have installed a translation for it.

           Of course you can also use parentheses:

               print __("permission denied");

           Or even:

               print (__("permission denied"));

           In my eyes, the first version without parentheses looks best.

       __x MSGID, ID1 => VAL1, ID2 => VAL2, ...
           One of the nicest features in Perl is its capability to interpolate
           variables into strings:

               print "This is the $color $thing.\n";

           This nice feature might con you into thinking that you could now
           write

               print __"This is the $color $thing.\n";

           Alas, that would be nice, but it is not possible.  Remember that
           the function __() serves both as an operator for translating
           strings and as a mark for translatable strings.  If the above
           string were extracted from your Perl code, the un-interpolated
           form would end up in the message catalog because when parsing your
           code it is unpredictable what values the variables $thing and
           $color will have at run-time (this fact is most probably one of the
           reasons you wrote your program in the first place).

           However, at run-time, Perl will have interpolated the values
           already before __() (or the underlying gettext() function) has
           seen the original string.  Consequently something like "This is the
           red car.\n" will be looked up in the message catalog, it will not
           be found (because only "This is the $color $thing.\n" is included
           in the database), and the original, untranslated string will be
           returned.  Honestly, because this is almost always an error, the
           xgettext(1) program will bail out with a fatal error when it comes
           across that string in your code.

           There are two workarounds for that.  The first one uses printf():

               printf __"This is the %s %s.\n", $color, $thing;

           But that has several disadvantages: Your translator will only see
           the isolated string, and without the surrounding code it is almost
           impossible to interpret it correctly.  Of course, GNU emacs and
           other software capable of editing PO translation files will allow
           you to examine the context in the source code, but it is more
           likely that your translator will look for a less challenging
           translation project when she frequently comes across such messages.

           And even if she does understand the underlying programming, what if
           she has to reorder the color and the thing like in French:

               msgid "This is the red car.\n"
               msgstr "Cela est la voiture rouge.\n"

           Zut alors! No way! You cannot portably reorder the arguments to
           printf() and friends in Perl (it is possible in C, but at the time
           of this writing not supported in Perl, and it would lead to other
           problems anyway).

           So what? The Perl backend to GNU gettext has defined an alternative
           format for interpolatable strings:

               "This is the {color} {thing}.\n";

           Instead of Perl variables you use place-holders (legal Perl
           variables are also legal place-holders) in curly braces, and then
           you call

               print __x ("This is the {color} {thing}.\n",
                          thing => $thang,
                          color => $color);

           The function __x() will take the additional hash and replace all
           occurrences of the hash keys in curly braces with the corresponding
           values.  Simple, readable, understandable to translators, what else
           would you want?  And if the translator forgets, misspells or
           otherwise messes up some "variables", the msgfmt(1) program, which
           is used to compile the textual translation file into its binary
           representation, will even choke on these errors and refuse to
           compile the translation.

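           To make the place-holder mechanism concrete, here is a minimal
           pure-Perl sketch of the substitution step.  It is an
           illustration under assumptions, not the code used by __x():
           the function name expand_placeholders() is invented, no
           catalog lookup takes place, and unknown place-holders are
           simply left untouched.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the interpolation step: replace every {key}
# that has a defined value in %args and leave other place-holders alone.
sub expand_placeholders {
    my ($template, %args) = @_;
    $template =~ s/\{([^{}]+)\}/defined $args{$1} ? $args{$1} : "{$1}"/ge;
    return $template;
}

# The original message and a translation that reorders the place-holders.
print expand_placeholders("This is the {color} {thing}.\n",
                          color => 'red', thing => 'car');
print expand_placeholders("Cela est la {thing} {color}.\n",
                          color => 'rouge', thing => 'voiture');
```

           Because the values are looked up by name, the translator is
           free to move the place-holders around, which is exactly what
           positional printf() arguments do not allow.
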
       __n MSGID, MSGID_PLURAL, COUNT
           Whew! That looks complicated ... It is best explained with an
           example.  We'll have another look at your vintage code:

               if ($files_deleted > 1) {
                   print "All files have been deleted.\n";
               } else {
                   print "One file has been deleted.\n";
               }

           Your intent is clear: you wanted to avoid the cumbersome "1 files
           deleted".  This is okay for English, but other languages have more
           than one plural form.  For example in Russian it makes a difference
           whether you want to say 1 file, 3 files or 6 files.  You will use
           a different form of the noun 'file' in each of the three cases.
           [Note: Yep, very smart you are, the Russian word for 'file' is in
           fact the English word, and it is an invariable noun, but if you
           know that, you will also understand the rest despite this little
           simplification ...].

           That is the reason for the existence of the function ngettext(),
           for which __n() is a short-cut:

               print __n"One file has been deleted.\n",
                        "All files have been deleted.\n",
                        $files_deleted;

           Alternatively:

               print __n ("One file has been deleted.\n",
                          "All files have been deleted.\n",
                          $files_deleted);

           The effect is always the same: libintl-perl will find out which
           plural form to pick for your user's language, and the output string
           will always look okay.

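           The difference between 1, 3 and 6 files can be seen by
           evaluating plural rules directly.  The sketch below encodes
           the rules for English and Russian as they are commonly written
           in the Plural-Forms header of PO files; the hash %plural_index
           is invented for this example, and no message catalog is
           involved.

```perl
use strict;
use warnings;

# Each rule returns the index of the plural form to use for a count $n.
# Index 0 is the singular form in both languages.
my %plural_index = (
    en => sub { my $n = shift; return $n != 1 ? 1 : 0 },
    ru => sub {
        my $n = shift;
        return 0 if $n % 10 == 1 && $n % 100 != 11;
        return 1 if $n % 10 >= 2 && $n % 10 <= 4
                 && ($n % 100 < 10 || $n % 100 >= 20);
        return 2;
    },
);

for my $n (1, 3, 6, 11, 21) {
    printf "%2d: en form %d, ru form %d\n",
        $n, $plural_index{en}->($n), $plural_index{ru}->($n);
}
```

           English needs two forms, the Russian rule yields three, and
           counts such as 11 and 21 show why a simple "greater than one"
           test in your own code cannot be translated.
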
       __nx MSGID, MSGID_PLURAL, COUNT, VAR1 => VAL1, VAR2 => VAL2, ...
           Bringing it all together:

               print __nx ("One file has been deleted.\n",
                           "{count} files have been deleted.\n",
                           $num_files,
                           count => $num_files);

           The function __nx() picks the correct plural form (also for
           English!) and it is capable of interpolating variables into
           strings.

           Have a close look at the order of arguments: The first argument is
           the string in the singular, the second one is the plural string.
           The third one is an integer indicating the number of items.  This
           third argument is only used to pick the correct translation.  The
           optionally following arguments make up the hash used for
           interpolation.  In the beginning it is often a little confusing
           that the variable holding the number of items will usually be
           repeated somewhere in the interpolation hash.

       __xn MSGID, MSGID_PLURAL, COUNT, VAR1 => VAL1, VAR2 => VAL2, ...
           Does exactly the same thing as __nx().  In fact it is a common typo
           promoted to a feature.

       __p MSGCTXT, MSGID
           This is much like __. The "p" stands for "particular", and the
           MSGCTXT is used to provide context to the translator. This may be
           necessary when your string is short, and could stand for multiple
           things. For example:

               print __p"Verb, to view", "View";
               print __p"Noun, a view", "View";

           The above may be "View" entries in a menu, where View->Source and
           File->View are different forms of "View", and likely need to be
           translated differently.

           A typical use case is GUI programs.  Imagine a program with a main
           menu and the notorious "Open" entry in the "File" menu.  Now
           imagine there is another menu entry Preferences->Advanced->Policy
           where you have a choice between the alternatives "Open" and
           "Closed".  In English, "Open" is the adequate text at both places.
           In other languages, it is very likely that you need two different
           translations.  Therefore, you would now write:

               __p"File|", "Open";
               __p"Preferences|Advanced|Policy", "Open";

           In English, or if no translation can be found, the second argument
           (MSGID) is returned.

           This function was introduced in libintl-perl 1.17.

       __px MSGCTXT, MSGID, VAR1 => VAL1, VAR2 => VAL2, ...
           Like __p(), but supports variable substitution in the string, like
           __x().

               print __px("Verb, to view", "View {file}", file => $filename);

           See __p() and __x() for more details.

           This function was introduced in libintl-perl 1.17.

       __np MSGCTXT, MSGID, MSGID_PLURAL, COUNT
           This adds context to plural calls. It should not be needed very
           often, if at all, due to the __nx() function. The type of variable
           substitution used in other gettext libraries (using sprintf-like
           symbols, like %s or %1) sometimes required context. For a (bad)
           example of this:

               printf (__np("[count] files have been deleted",
                           "One file has been deleted.\n",
                           "%s files have been deleted.\n",
                           $num_files),
                       $num_files);

           NOTE: The above usage is discouraged. Just use the __nx() call,
           which provides inline context via the key names.

           This function was introduced in libintl-perl 1.17.

       __npx MSGCTXT, MSGID, MSGID_PLURAL, COUNT, VAR1 => VAL1, VAR2 => VAL2,
       ...
           This is provided for completeness. It adds the variable
           interpolation into the string to the previous method, __np().

           Its usage would be like so:

               print __npx ("Files being permanently removed",
                            "One file has been deleted.\n",
                            "{count} files have been deleted.\n",
                            $num_files,
                            count => $num_files);

           I cannot think of any situations requiring this, but we can easily
           support it, so here it is.

           This function was introduced in libintl-perl 1.17.

       N__ (ARG1, ARG2, ...)
           A no-op function that simply echoes its arguments to the caller.
           Take the following piece of Perl:

               my @options = (
                   "Open",
                   "Save",
                   "Save As",
               );

               ...

               my $option = $options[1];

           Now say that you want to have this translatable.  You could
           sometimes simply do:

               my @options = (
                   __"Open",
                   __"Save",
                   __"Save As",
               );

               ...

               my $option = $options[1];

           But oftentimes this will not be what you want, for example when
           you also need the unmodified original string.  Sometimes it may not
           even work, for example, when the preferred user language is not yet
           determined at the time that the list is initialized.

           In these cases you would write:

               my @options = (
                   N__"Open",
                   N__"Save",
                   N__"Save As",
               );

               ...

               my $option = __($options[1]);
               # or: my $option = dgettext ('my-domain', $options[1]);

           Now all the strings in @options will be left alone, since N__()
           returns its arguments (one or more) unmodified.  Nevertheless, the
           string extractor will be able to recognize the strings as being
           translatable.  And you can still get the translation later by
           passing the variable instead of the string to one of the above
           translation functions.

       N__n (MSGID, MSGID_PLURAL, COUNT)
           Does exactly the same as N__().  You will use this form if you have
           to mark the strings as having plural forms.

       N__p (MSGCTXT, MSGID)
           Marks MSGID as N__() does, but in the context MSGCTXT.

       N__np (MSGCTXT, MSGID, MSGID_PLURAL, COUNT)
           Marks MSGID as N__n() does, but in the context MSGCTXT.

EXPORTED VARIABLES

       The module exports several variables into your namespace:

       %__ A tied hash.  Its keys are your original messages, the values are
           their translations:

               my $title = "<h1>$__{'My Homepage'}</h1>";

           This is much better for your translation team than

               my $title = __"<h1>My Homepage</h1>";

           In the second case the HTML code will make it into the translation
           database and your translators have to be aware of HTML syntax when
           translating strings.

           Warning: Do not use this hash outside of double-quoted strings!
           The code in the tied hash object relies on the correct working of
           the function caller() (see "perldoc -f caller"), and this function
           will report incorrect results if the tied hash value is the
           argument to a function from another package, for example:

             my $result = Other::Package::do_it ($__{'Some string'});

           The tied hash code will see "Other::Package" as the calling
           package, instead of your own package.  Consequently it will look up
           the message in the wrong text domain.  There is no workaround for
           this bug.  Therefore:

           Never use the tied hash outside of interpolated strings!

       $__ A reference to "%__", in case you prefer:

                my $title = "<h1>$__->{'My Homepage'}</h1>";

PERFORMANCE

       Message translation can be a time-consuming task.  Take this little
       example:

           1: use Locale::TextDomain ('my-domain');
           2: use POSIX qw (:locale_h);
           3:
           4: setlocale (LC_ALL, '');
           5: print __"Hello world!\n";

       This will usually be quite fast, but in pathological cases it may run
       for several seconds.  A worst-case scenario would be a Chinese user at
       a terminal that understands the codeset Big5-HKSCS.  Your translator
       for Chinese has however chosen to encode the translations in the
       codeset EUC-TW.

       What will happen at run-time?  First, the library will search and load
       a (maybe large) message catalog for your textdomain 'my-domain'.  Then
       it will look up the translation for "Hello world!\n" and find that
       it is encoded in EUC-TW.  Since that differs from the output codeset
       Big5-HKSCS, it will first load a conversion table containing several
       tens of thousands of code points for EUC-TW, then it does the same
       with the smaller, but still very large conversion table for
       Big5-HKSCS, it will convert the translation on the fly from EUC-TW
       into Big5-HKSCS, and finally it will return the converted translation.

       A worst-case scenario but realistic.  And for these five lines of
       code, there is not much you can do to make it any faster.  You should
       understand, however, when the different steps will take place, so that
       you can arrange your code for it.

       You have learned in the section "DESCRIPTION" that line 1 is
       responsible for locating your message database.  However, the use()
       will do nothing more than remember your settings.  It will not
       search any directories, it will not load any catalogs or conversion
       tables.

       Somewhere in your code you will always have a call to
       POSIX::setlocale(), and this call may be time-consuming, depending on
       the architecture of your system.  On some systems, this will consume
       very little time, on others it will only consume a considerable amount
       of time for the first call, and on others it may always be
       time-consuming.  Since you cannot know how setlocale() is implemented
       on the target system, you should reduce the calls to setlocale() to a
       minimum.

       Line 5 requests the translation for your string.  Only now, the library
       will actually load the message catalog, and only now will it load
       any conversion tables that may be needed.  And from now on, all this
       information will be cached in memory.  This strategy is used throughout
       libintl-perl, and you may describe it as 'load-on-first-access'.
       Getting the next translation will consume very little resources.

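       The 'load-on-first-access' strategy can be pictured with a small
       cache.  This is a sketch of the general idea only and says nothing
       about how libintl-perl is implemented internally: the functions
       load_catalog() and translate(), the one-entry catalog and the load
       counter are all invented for this illustration.

```perl
use strict;
use warnings;

my %catalog_cache;        # textdomain => { msgid => translation }
my $catalog_loads = 0;    # counts the expensive step, for illustration only

# Hypothetical loader standing in for reading a catalog from disk.
sub load_catalog {
    my ($textdomain) = @_;
    $catalog_loads++;
    return { "Hello world!\n" => "Hallo Welt!\n" };
}

# Load the catalog on first access, then answer from memory.
sub translate {
    my ($textdomain, $msgid) = @_;
    my $catalog = $catalog_cache{$textdomain} ||= load_catalog($textdomain);
    return exists $catalog->{$msgid} ? $catalog->{$msgid} : $msgid;
}

print translate('my-domain', "Hello world!\n") for 1 .. 3;
print "catalog loaded $catalog_loads time(s)\n";
```

       Only the first lookup pays for loading; later lookups still cost a
       function call and a hash access each.
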
       However, although the translation retrieval is somewhat obfuscated by
       an operator-like function call, it is still a function call, and in
       fact it even involves a chain of function calls.  Consequently, the
       following example is probably bad practice:

           foreach (1 .. 100_000) {
               print __"Hello world!\n";
           }

       This example introduces a lot of overhead into your program.  Better do
       this:

           my $string = __"Hello world!\n";
           foreach (1 .. 100_000) {
               print $string;
           }

       The translation will never change, there is no need to retrieve it over
       and over again.  Although libintl-perl will of course cache the
       translation read from the file system, you can still avoid the overhead
       for the function calls.

AUTHOR

       Copyright (C) 2002-2009, Guido Flohr <guido@imperia.net>, all rights
       reserved.  See the source code for details.

       This software is contributed to the Perl community by Imperia
       (<http://www.imperia.net/>).

SEE ALSO

       Locale::Messages(3pm), Locale::gettext_pp(3pm), perl(1), gettext(1),
       gettext(3)

perl v5.12.0                      2010-05-02             Locale::TextDomain(3)