Lingua::Translit(3) User Contributed Perl Documentation Lingua::Translit(3)

NAME
Lingua::Translit - transliterates text between writing systems

SYNOPSIS
    use Lingua::Translit;

    my $tr = Lingua::Translit->new("ISO 843");

    my $text_tr = $tr->translit("character oriented string");

    if ($tr->can_reverse()) {
        $text_tr = $tr->translit_reverse("character oriented string");
    }

DESCRIPTION
Lingua::Translit can be used to convert text from one writing system to
another, based on national or international transliteration tables.
Where possible, reverse transliteration is supported.

The term "transliteration" describes the conversion of text from one
writing system or alphabet to another. Ideally the conversion is
unique, mapping one character to exactly one character, so that the
original spelling can be reconstructed. In practice this is not always
the case: a single letter of the original alphabet may be transcribed
as two, three or even more letters.

Furthermore, more than one transliteration scheme may exist for a
single writing system. To interoperate with other systems and to be
able to reconstruct the original data, it is therefore essential to
know which scheme will be or has been used to transliterate a text.

Reconstruction is a problem, though, for non-unique transliterations if
no language-specific knowledge is available, because the resulting
clusters of letters may be ambiguous. For example, the Greek character
"PSI" maps to "ps", but "ps" could also result from the sequence "PI",
"SIGMA", since "PI" maps to "p" and "SIGMA" maps to "s". If a
transliteration table leads to ambiguous conversions, the table cannot
be used in reverse.

Otherwise, the table can be used in both directions if desired. For
example, ISO 9 was originally created to convert Cyrillic letters to
the Latin alphabet, so its reverse transliteration transforms Latin
letters into Cyrillic ones.

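The "PSI" ambiguity described above can be reproduced with a small,
self-contained sketch. The three-entry table and the helper
toy_translit() are hypothetical and exist only for illustration; they
are neither the module's ISO 843 data nor part of its API.

```perl
use strict;
use warnings;

# Hypothetical toy table, NOT the module's ISO 843 data: three Greek
# letters and the Latin strings they map to.
my %to_latin = (
    "\x{03C8}" => 'ps',    # GREEK SMALL LETTER PSI
    "\x{03C0}" => 'p',     # GREEK SMALL LETTER PI
    "\x{03C3}" => 's',     # GREEK SMALL LETTER SIGMA
);

# Transliterate character by character; unknown characters pass through.
sub toy_translit {
    my ($text) = @_;
    return join '', map { $to_latin{$_} // $_ } split //, $text;
}

my $psi      = toy_translit("\x{03C8}");            # psi
my $pi_sigma = toy_translit("\x{03C0}\x{03C3}");    # pi followed by sigma

print "psi      -> $psi\n";
print "pi+sigma -> $pi_sigma\n";

# Both inputs collapse to the same Latin string, so "ps" alone does not
# reveal which Greek spelling was the original.
print "ambiguous: cannot be reversed safely\n" if $psi eq $pi_sigma;
```

Because both inputs produce the same Latin string, a reverse mapping
cannot choose between them without knowledge of the language. A table
with such collisions is therefore listed as not reversible.
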
METHODS
new("name of table")
    Initializes an object with the specified transliteration table,
    e.g. "ISO 9".

translit("character oriented string")
    Transliterates the given text according to the object's
    transliteration table. Returns the transliterated text.

translit_reverse("character oriented string")
    Transliterates the given text according to the object's
    transliteration table, but applies the table in the opposite
    direction. For example, the table ISO 9 is a transliteration scheme
    for converting Cyrillic letters to the Latin alphabet, so when it
    is used in reverse, Latin letters are mapped to Cyrillic ones.

    Returns the transliterated text.

can_reverse()
    Returns true (1) if and only if reverse transliteration is
    possible, false (0) otherwise.

name()
    Returns the name of the chosen transliteration table, e.g. "ISO 9".

desc()
    Returns a description of the transliteration, e.g. "ISO 9:1995,
    Cyrillic to Latin".

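As a rough sketch of how these methods fit together, the following
example prints a table's name and description and then attempts a round
trip. It assumes Lingua::Translit is installed; the helper names
describe_table() and round_trip() are made up for this example, and the
eval guard only keeps the script from dying where the module is
missing.

```perl
use strict;
use warnings;

# Assumption: Lingua::Translit is installed. The guard lets the sketch
# run (with a fallback message) on systems where it is not.
my $have_module = eval { require Lingua::Translit; 1 };

# Hypothetical helper: summarize a table using name(), desc() and
# can_reverse().
sub describe_table {
    my ($table) = @_;
    return "Lingua::Translit is not installed" unless $have_module;

    my $tr        = Lingua::Translit->new($table);
    my $direction = $tr->can_reverse() ? "reversible" : "not reversible";
    return sprintf "%s (%s): %s", $tr->name(), $direction, $tr->desc();
}

# Hypothetical helper: transliterate, then reverse again. Returns undef
# if the module is missing or the table cannot be used in reverse.
sub round_trip {
    my ($table, $text) = @_;
    return unless $have_module;

    my $tr = Lingua::Translit->new($table);
    return unless $tr->can_reverse();
    return $tr->translit_reverse($tr->translit($text));
}

# Character-oriented input: the Cyrillic spelling of "Moskva", written
# as code points so the source file's encoding does not matter.
my $text = "\x{041C}\x{043E}\x{0441}\x{043A}\x{0432}\x{0430}";

print describe_table("ISO 9"), "\n";

my $back = round_trip("ISO 9", $text);
if (defined $back) {
    print $back eq $text ? "round trip matches\n" : "round trip differs\n";
}
```

Checking can_reverse() before calling translit_reverse() mirrors the
guard in the synopsis and avoids relying on a table that only works in
one direction.
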
SUPPORTED TRANSLITERATIONS
Cyrillic
    ALA-LC RUS, not reversible, ALA-LC:1997, Cyrillic to Latin, Russian

    ISO 9, reversible, ISO 9:1995, Cyrillic to Latin

    ISO/R 9, reversible, ISO 9:1954, Cyrillic to Latin

    DIN 1460 RUS, reversible, DIN 1460:1982, Cyrillic to Latin, Russian

    DIN 1460 UKR, reversible, DIN 1460:1982, Cyrillic to Latin,
    Ukrainian

    DIN 1460 BUL, reversible, DIN 1460:1982, Cyrillic to Latin,
    Bulgarian

    Streamlined System BUL, not reversible, The Streamlined System:
    2006, Cyrillic to Latin, Bulgarian

    GOST 7.79 RUS, reversible, GOST 7.79:2000 (table B), Cyrillic to
    Latin, Russian

    GOST 7.79 RUS OLD, not reversible, GOST 7.79:2000 (table B),
    Cyrillic to Latin with support for Old Russian (pre 1918), Russian

    GOST 7.79 UKR, reversible, GOST 7.79:2000 (table B), Cyrillic to
    Latin, Ukrainian

    BGN/PCGN RUS Standard, not reversible, BGN/PCGN:1947 (Standard
    Variant), Cyrillic to Latin, Russian

    BGN/PCGN RUS Strict, not reversible, BGN/PCGN:1947 (Strict
    Variant), Cyrillic to Latin, Russian

Greek
    ISO 843, not reversible, ISO 843:1997, Greek to Latin

    DIN 31634, not reversible, DIN 31634:1982, Greek to Latin

    Greeklish, not reversible, Greeklish (Phonetic), Greek to Latin

Latin
    Common CES, not reversible, Czech without diacritics

    Common DEU, not reversible, German without umlauts

    Common POL, not reversible, Unaccented Polish

    Common RON, not reversible, Romanian without diacritics as commonly
    used

    Common SLK, not reversible, Slovak without diacritics

    Common SLV, not reversible, Slovenian without diacritics

    ISO 8859-16 RON, reversible, Romanian with appropriate diacritics

Arabic
    Common ARA, not reversible, Common Romanization of Arabic

Sanskrit
    IAST Devanagari, not reversible, IAST Romanization to Devanāgarī

    Devanagari IAST, not reversible, Devanāgarī to IAST Romanization

ADDING NEW TRANSLITERATIONS
If you want to add your own transliteration tables to
Lingua::Translit, have a look at the developer documentation at
<https://www.netzum-sorglos.de/software/lingua-translit/developer-documentation.html>.

A template for a transliteration table is provided as well
(xml/template.xml), so you can easily start developing.

RESTRICTIONS
Lingua::Translit is designed to handle Unicode and uses comparisons and
regular expressions that rely on code points. Therefore, any input is
expected to be character-oriented ("use utf8;", ...) rather than
byte-oriented.

However, if your data is byte-oriented, be sure to pass it UTF-8
encoded to translit() and/or translit_reverse(); it will be converted
internally.

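To illustrate the difference between byte-oriented and
character-oriented data, the following sketch uses the core Encode
module to decode UTF-8 bytes into a character string. It shows one
possible way to prepare input and does not call Lingua::Translit
itself; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# UTF-8 encoded bytes for the Greek letter psi, as they might arrive
# from a file or socket that was read without an encoding layer.
my $bytes = "\xCF\x88";

# Decode into a character string: one character, code point U+03C8.
my $chars = decode('UTF-8', $bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);

# Encode again before writing to a byte-oriented destination.
my $out = encode('UTF-8', $chars);
```

The script prints "bytes: 2, characters: 1": the same letter is two
bytes in UTF-8 but a single character once decoded, and comparisons
based on code points operate on the latter.
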
BUGS
None known.

Please report bugs using CPAN's request tracker at
<https://rt.cpan.org/Public/Dist/Display.html?Name=Lingua-Translit>.

SEE ALSO
Lingua::Translit::Tables, Encode, perlunicode

"translit"'s manpage

<http://www.netzum-sorglos.de/software/lingua-translit/>

CREDITS
Thanks to Dr. Daniel Eiwen, Romanisches Seminar, Universitaet Koeln for
his help on Romanian transliteration.

Thanks to Dmitry Smal and Rusar Publishing for contributing the "ALA-LC
RUS" transliteration table.

Thanks to Ahmed Elsheshtawy for his help implementing the "Common ARA"
Arabic transliteration.

Thanks to Dusan Vuckovic for contributing the "ISO/R 9" transliteration
table.

Thanks to Ștefan Suciu for contributing the "ISO 8859-16 RON"
transliteration table.

Thanks to Philip Kime for contributing the "IAST Devanagari" and
"Devanagari IAST" transliteration tables.

Thanks to Nikola Lečić for contributing the "BGN/PCGN RUS Standard" and
"BGN/PCGN RUS Strict" transliteration tables.

AUTHORS
Alex Linke <alinke@netzum-sorglos.de>

Rona Linke <rlinke@netzum-sorglos.de>

LICENSE AND COPYRIGHT
Copyright (C) 2007-2008 Alex Linke and Rona Linke

Copyright (C) 2009-2016 Lingua-Systems Software GmbH

Copyright (C) 2016-2017 Netzum Sorglos, Lingua-Systems Software GmbH

Copyright (C) 2017-2022 Netzum Sorglos Software GmbH

This module is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

perl v5.34.1 2022-06-16 Lingua::Translit(3)