Locale::Codes::Types(3)  User Contributed Perl Documentation  Locale::Codes::Types(3)

NAME

       Locale::Codes::Types - types of data sets supported

DESCRIPTION

       This document contains a description of different types of code sets
       supported by the Locale-Codes distribution.

       The following types are supported:

       "country"
       "language"
       "currency"
       "script"
       "langfam"
       "langvar"
       "langext"

       Any time you have to specify the type of data, use one of the values
       from this list.  When using the OO interface, you have to specify the
       type of data you are working with.  For example:

          use Locale::Codes;
          ...
          $obj->type('country');
          $obj->type('langext');

       When using the traditional interfaces, the functions all have the data
       type included in the function name.  For example:

          use Locale::Codes::Country;
          code2country(...);

          use Locale::Codes::LangExt;
          code2langext(...);

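       As a minimal sketch of the two interfaces side by side, the following
       looks up the same country code both ways.  It assumes the Locale-Codes
       distribution is installed and that the constructor accepts the type
       as its first argument; the variable names are illustrative only.

```perl
use strict;
use warnings;

# OO interface: the type of data is chosen on the object.
use Locale::Codes;
my $obj     = Locale::Codes->new('country');
my $oo_name = $obj->code2name('tv');      # uses the default country code set

# Traditional interface: the type is part of the function name.
use Locale::Codes::Country;
my $trad_name = code2country('tv');

print "OO:          $oo_name\n";
print "traditional: $trad_name\n";
```

       Both calls should report the same name, since they read the same
       underlying data.
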
       Each type of data may have any number of code sets.  Code sets may be
       specified by name.  Traditionally, a Perl constant was exported and
       could also be used to specify the code set.

       Both methods are available for both the OO and traditional interfaces,
       so whenever a function or method takes an argument specifying a code
       set, either the name or a constant can be used.

       In the lists below, each code set is given by its name followed by
       its constant.  So, for example, the first country code set is named
       'alpha-2' and has a Perl constant "LOCALE_COUNTRY_ALPHA_2"
       associated with it.  When using the OO interface, the constants are
       only available if you import them by loading the module with:

          use Locale::Codes ':constants';

       The constants are always available when using the traditional
       interfaces.

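       The sketch below selects the same code set once by name and once by
       constant through the OO interface.  It assumes Locale-Codes is
       installed and that the code set is passed as the second argument to
       the lookup method; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Import the code set constants for use with the OO interface.
use Locale::Codes ':constants';

my $obj = Locale::Codes->new('country');

# The same code set, selected by name and then by constant.
my $by_name     = $obj->code2name('brb', 'alpha-3');
my $by_constant = $obj->code2name('brb', LOCALE_COUNTRY_ALPHA_3);

print "by name:     $by_name\n";
print "by constant: $by_constant\n";
```

       Using the name avoids the extra import; using the constant lets Perl
       catch a misspelled code set at compile time under "use strict".
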
       Some of the older Perl constant names were not consistent, and in
       those cases, two constants are available (a newer consistent name and
       the older inconsistent one).  Either may be used.

       The default code set for each type is marked with an asterisk (*).

country

       Code sets for identifying countries are maintained by several different
       agencies and standards.

       The following code sets are maintained in the ISO 3166 standard.  The
       official home page for the ISO 3166 maintenance agency is:
       <http://www.iso.org/iso/home/standards/country_codes.htm> .

       Only the officially assigned codes are included.

       * alpha-2, LOCALE_COUNTRY_ALPHA_2, LOCALE_CODE_ALPHA_2
           This is the set of two-letter (lowercase) codes from ISO 3166-1,
           such as 'tv' for Tuvalu.

       alpha-3, LOCALE_COUNTRY_ALPHA_3, LOCALE_CODE_ALPHA_3
           This is the set of three-letter (lowercase) codes from ISO 3166-1,
           such as 'brb' for Barbados. These codes are actually defined and
           maintained by the U.N. Statistics division.

       numeric, LOCALE_COUNTRY_NUMERIC, LOCALE_CODE_NUMERIC
           This is the set of three-digit numeric codes from ISO 3166-1, such
           as 064 for Bhutan.

           If a 2-digit code is entered, it is converted to 3 digits by
           prepending a 0.

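       The zero-padding rule for numeric codes can be illustrated with plain
       Perl.  The helper below is hypothetical and is not part of
       Locale-Codes; it only shows the padding step.  Note that numeric
       codes are best passed as strings, because a bare 064 in Perl source
       is an octal literal (decimal 52).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale-Codes): left-pad a numeric
# code with zeros so that it is always three digits wide.
sub pad_numeric_code {
    my ($code) = @_;
    return sprintf('%03d', $code);
}

print pad_numeric_code('64'),  "\n";   # 064
print pad_numeric_code('064'), "\n";   # 064
```
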
       A list of domain names is maintained by the IANA (Internet Assigned
       Numbers Authority).  These are available at:
       <http://www.iana.org/domains/root/db/> .  Only the actual country codes
       are used, and the country names come from ISO 3166.

       dom, LOCALE_COUNTRY_DOM, LOCALE_CODE_DOM
           The country domains assigned by IANA are usually the two-letter
           (lowercase) codes from ISO 3166, but there are a few other
           additions.

       The United Nations also maintains country lists.  Their list is
       similar, but not identical, to the ISO 3166 list.

       The data is available here:
       <https://unstats.un.org/unsd/methodology/m49/>

       Previously, this table was treated as a source of the ISO 3166 data,
       but I found that the table was incomplete, so I stopped using it.
       Later, it was added back in as its own list of codes.

       un-alpha-3, LOCALE_COUNTRY_UN_ALPHA_3, LOCALE_CODE_UN_ALPHA_3
           This is similar to the 'alpha-3' set from ISO 3166, except that the
           codes are uppercase.

       un-numeric, LOCALE_COUNTRY_UN_NUMERIC, LOCALE_CODE_UN_NUMERIC
           This is similar to the 'numeric' set from ISO 3166.

       The US Government also keeps a list of codes.  Originally, it
       maintained the FIPS-11 code set, but this was deprecated and replaced
       by the GENC code set.  The FIPS-11 code sets are no longer supported by
       Locale-Codes.

       The GENC code sets are available here:
       <https://nsgreg.nga.mil/genc/discovery> .  They are also similar, but
       not identical, to the ISO 3166 code sets.

       genc-alpha-2, LOCALE_COUNTRY_GENC_ALPHA_2, LOCALE_CODE_GENC_ALPHA_2
           Similar to the 'alpha-2' set, but uppercase.

       genc-alpha-3, LOCALE_COUNTRY_GENC_ALPHA_3, LOCALE_CODE_GENC_ALPHA_3
           Similar to the 'alpha-3' set, but uppercase.

       genc-numeric, LOCALE_COUNTRY_GENC_NUMERIC, LOCALE_CODE_GENC_NUMERIC
           Similar to the 'numeric' set.

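       Because the country code sets overlap, a code in one set can usually
       be converted to another.  The sketch below assumes Locale-Codes is
       installed and uses the traditional country_code2code function with
       code set names from the lists above; a code with no counterpart in
       the target set is reported as "(none)".

```perl
use strict;
use warnings;
use Locale::Codes::Country;

# Convert an ISO 3166 alpha-2 code into several other country code sets.
my $code = 'tv';
for my $target ('alpha-3', 'numeric', 'un-alpha-3', 'genc-alpha-3') {
    my $converted = country_code2code($code, 'alpha-2', $target);
    printf "%-12s %s\n", $target,
        defined($converted) ? $converted : '(none)';
}
```
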
       There are other sources of codes that are not currently used in this
       distribution.

       ISO 3166-2 defines codes for country sub-divisions (states, counties,
       provinces, etc).  A module supporting these codes is not part of the
       Locale-Codes distribution, but is available from CPAN in
       CPAN/modules/by-module/Locale/

       The World Factbook maintained by the CIA is a potential source of the
       data.  Unfortunately, it adds/preserves non-standard codes, so it is
       not used as a source of data.
       <https://www.cia.gov/library/publications/the-world-factbook/appendix/appendix-d.html>

       Another unofficial source of data is the Statoids web site:
       <http://www.statoids.com/wab.html> . Currently, it is not used to get
       data, but the notes and explanatory material were very useful for
       understanding discrepancies between the sources.

language

       Code sets for identifying languages come from a couple of different
       sources.

       The primary source is ISO 639.  The ISO 639-2 codes are available
       here: <http://www.loc.gov/standards/iso639-2/> and the ISO 639-5 codes
       are available here: <http://www.loc.gov/standards/iso639-5/> .

       In addition, the IANA maintains a language registry whose entries are
       added to the ISO lists.  Because it is intended to supplement the ISO
       standard, the IANA list is not kept as a separate list.

       The IANA data is available here:
       <http://www.iana.org/assignments/language-subtag-registry>

       The code sets are:

       * alpha-2, LOCALE_LANGUAGE_ALPHA_2, LOCALE_LANG_ALPHA_2
           This is the set of two-letter (lowercase) codes from ISO 639-1,
           such as 'he' for Hebrew.  It also includes additions to this set
           included in the IANA language registry.

       alpha-3, LOCALE_LANGUAGE_ALPHA_3, LOCALE_LANG_ALPHA_3
           This is the set of three-letter (lowercase) bibliographic codes
           from ISO 639-2 and 639-5, such as 'heb' for Hebrew.  It also
           includes additions to this set included in the IANA language
           registry.

       term, LOCALE_LANGUAGE_TERM, LOCALE_LANG_TERM
           This is the set of three-letter (lowercase) terminologic codes from
           ISO 639.
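
       As an illustration of the language code sets, the sketch below looks
       up Hebrew through the two-letter and three-letter sets.  It assumes
       Locale-Codes is installed and uses the traditional functions from
       Locale::Codes::Language; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Locale::Codes::Language;

# Look up the same language in the two-letter and three-letter code sets.
my $from_alpha2 = code2language('he');               # default set: alpha-2
my $from_alpha3 = code2language('heb', 'alpha-3');
my $alpha3_code = language2code('Hebrew', 'alpha-3');

print "he     => $from_alpha2\n";
print "heb    => $from_alpha3\n";
print "Hebrew => $alpha3_code\n";
```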

currency

       The source of currency codes is the ISO 4217 data available here:
       <https://www.six-group.com/en/products-services/financial-information/data-standards.html>

       The code sets are:

       * alpha, LOCALE_CURRENCY_ALPHA, LOCALE_CURR_ALPHA
           This is a set of three-letter (uppercase) codes from ISO 4217 such
           as EUR for Euro.

           Two of the codes specified by the standard (XTS which is reserved
           for testing purposes and XXX which is for transactions where no
           currency is involved) are omitted.

       num, LOCALE_CURRENCY_NUMERIC, LOCALE_CURR_NUMERIC
           This is the set of three-digit numeric codes from ISO 4217.
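
       The two currency code sets can be exercised as follows.  This is a
       sketch that assumes Locale-Codes is installed and uses the
       traditional functions from Locale::Codes::Currency with the code set
       names 'alpha' and 'num' listed above.

```perl
use strict;
use warnings;
use Locale::Codes::Currency;

my $name    = code2currency('EUR');                       # default set: alpha
my $numeric = currency_code2code('EUR', 'alpha', 'num');  # three-digit code

print "EUR is $name, numeric code $numeric\n";
```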

script

       The source of script code sets is ISO 15924 available here:
       <http://www.unicode.org/iso15924/>

       Additional data comes from the IANA language subtag registry:
       <http://www.iana.org/assignments/language-subtag-registry> .

       Code sets are:

       * alpha, LOCALE_SCRIPT_ALPHA
           This is a set of four-letter (capitalized) codes from ISO 15924
           such as 'Phnx' for Phoenician.  It also includes additions to this
           set included in the IANA language registry.

           The Zxxx, Zyyy, and Zzzz codes are not used.

       num, LOCALE_SCRIPT_NUMERIC
           This is a set of three-digit numeric codes from ISO 15924 such as
           115 for Phoenician.

langfam

       Language families are specified using codes from ISO 639-5 available
       here: <http://www.loc.gov/standards/iso639-5/id.php>

       Code sets are:

       * alpha, LOCALE_LANGFAM_ALPHA
           This is the set of three-letter (lowercase) codes from ISO 639-5
           such as 'apa' for Apache languages.

langvar

       Language variations are specified using codes from the IANA language
       subtag registry available here:
       <http://www.iana.org/assignments/language-subtag-registry>

       Code sets are:

       * alpha, LOCALE_LANGVAR_ALPHA
           This is the set of alphanumeric codes from the IANA language
           registry, such as 'arevela' for Eastern Armenian.

langext

       Language extensions are specified using codes from the IANA language
       subtag registry available here:
       <http://www.iana.org/assignments/language-subtag-registry>

       Code sets are:

       * alpha, LOCALE_LANGEXT_ALPHA
           This is the set of three-letter (lowercase) codes from the IANA
           language registry, such as 'acm' for Mesopotamian Arabic.
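
       To see every supported type at once, the sketch below creates one
       object per type and counts the codes in its default code set.  It
       assumes Locale-Codes is installed and that all_codes returns a list
       of codes; the %count hash is illustrative only.

```perl
use strict;
use warnings;
use Locale::Codes;

# Report how many codes the default code set of each type contains.
my %count;
for my $type (qw(country language currency script langfam langvar langext)) {
    my $obj   = Locale::Codes->new($type);
    my @codes = $obj->all_codes();
    $count{$type} = scalar @codes;
    printf "%-10s %d codes\n", $type, $count{$type};
}
```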

NEW CODE SETS

       I'm always open to suggestions for new code sets.

       In order for me to add a code set, I want the following criteria to be
       met:

       General-use code set
           If a code set is not general use, I'm not likely to spend the time
           to add and support it.

       An official source of data
           I require an official (or at least, a NEARLY official) source where
           I can get the data on a regular basis.

           Ideally, I'd only get data from an official source, but sometimes
           that is not possible. For example, the ISO standards are not
           typically available for free, so I may have to get some of that
           data from alternate sources that I'm confident are getting their
           data from the official source.  However, I will always be hesitant
           to accept a non-official source.

           As an example, I used to get some country data from the CIA World
           Factbook. Given the nature of the source, I'm sure they're updating
           data from the official sources and I consider it "nearly" official.
           However, even in this case, I found that they were adding codes
           that were not part of the standard, so I have stopped using them as
           a source.

           There are many 3rd party sites which maintain lists (many of which
           are actually in a more convenient form than the official sites).
           Unfortunately, I will reject most of them since I have no feel for
           how "official" they are.

       A free source of the data
           Obviously, the data must be free-of-charge. I'm not interested in
           paying for the data (and I'm not interested in the overhead of
           having someone else pay for the data for me).

       A reliable source of data
           The data must come from a source that I can reasonably expect to
           exist for the foreseeable future since I will be extremely
           reluctant to drop support for a data set once it's included.

           I am also reluctant to accept data sent to me by an individual.
           Although I appreciate the offer, it is simply not practical to
           consider an individual contribution as a reliable source of data.
           The source should be an official agency of some sort.

       These requirements are open to discussion. If you have a code set you'd
       like to see added, but which may not meet all of the above
       requirements, feel free to email me and we'll discuss it.  Depending on
       circumstances, I may be willing to waive some of these criteria.

SEE ALSO

       Locale::Codes
           The Locale-Codes distribution.

AUTHOR

       See Locale::Codes for full author history.

       Currently maintained by Sullivan Beck (sbeck@cpan.org).

COPYRIGHT

          Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
          Copyright (c) 2001-2010 Neil Bowers
          Copyright (c) 2010-2023 Sullivan Beck

       This module is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.36.1                      2023-06-08           Locale::Codes::Types(3)