1locale(5) Standards, Environments, and Macros locale(5)
2
3
4
5 NAME
6 locale - subset of a user's environment that depends on language and
7 cultural conventions
8
9 DESCRIPTION
10 A locale is the definition of the subset of a user's environment that
11 depends on language and cultural conventions. It is made up from one or
12 more categories. Each category is identified by its name and controls
13 specific aspects of the behavior of components of the system. Category
14 names correspond to the following environment variable names:
15
16 LC_CTYPE Character classification and case conversion.
17
18
19 LC_COLLATE Collation order.
20
21
22 LC_TIME Date and time formats.
23
24
25 LC_NUMERIC Numeric formatting.
26
27
28 LC_MONETARY Monetary formatting.
29
30
31 LC_MESSAGES Formats of informative and diagnostic messages and
32 interactive responses.
33
34
35
36 The standard utilities base their behavior on the current locale, as
37 defined in the ENVIRONMENT VARIABLES section for each utility. The
38 behavior of some of the C-language functions will also be modified
39 based on the current locale, as defined by the last call to setlo‐
40 cale(3C).
41
42
43 Locales other than those supplied by the implementation can be created
44 by the application via the localedef(1) utility. The value that is used
45 to specify a locale when using environment variables will be the string
46 specified as the name operand to localedef when the locale was cre‐
47 ated. The strings "C" and "POSIX" are reserved as identifiers for the
48 POSIX locale.
49
50
51 Applications can select the desired locale by invoking the setlocale()
52 function with the appropriate value. If the function is invoked with an
53 empty string, such as:
54
55 setlocale(LC_ALL, "");
56
57
58
59 the value of the corresponding environment variable is used. If no
60 applicable environment variable is set, or its value is the empty string,
61 setlocale() falls back to the implementation-defined default locale.
62
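The selection behavior described above can be sketched in Python, whose standard locale module wraps the C setlocale() function. The helper name select_locale and the decision to fall back to the "C" locale when the requested locale is unavailable are assumptions made for illustration; they are not part of this manual page.

```python
import locale


def select_locale(name=""):
    """Select a locale for all categories and return its name.

    An empty string defers to the environment, mirroring
    setlocale(LC_ALL, "") in C.
    """
    try:
        return locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        # Assumption for this sketch: fall back to the POSIX locale
        # when the requested locale is not installed.
        return locale.setlocale(locale.LC_ALL, "C")


if __name__ == "__main__":
    print(select_locale(""))   # whatever the environment selects
    print(select_locale("C"))  # the reserved POSIX locale
```

The string returned for the empty-string case depends on the environment, so the sketch prints it rather than assuming a value.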
63 Locale Definition
64 Locales can be described with the file format accepted by the localedef
65 utility.
66
67
68 The locale definition file must contain one or more locale category
69 source definitions, and must not contain more than one definition for
70 the same locale category.
71
72
73 A category source definition consists of a category header, a category
74 body and a category trailer. A category header consists of the character
75 string naming the category, beginning with the characters LC_.
76 The category trailer consists of the string END, followed by one or
77 more blank characters and the string used in the corresponding category
78 header.
79
80
81 The category body consists of one or more lines of text. Each line con‐
82 tains an identifier, optionally followed by one or more operands. Iden‐
83 tifiers are either keywords, identifying a particular locale element,
84 or collating elements. Each keyword within a locale must have a unique
85 name (that is, two categories cannot have a commonly-named keyword). No
86 keyword can start with the characters LC_. Identifiers must be sepa‐
87 rated from the operands by one or more blank characters.
88
89
90 Operands must be characters, collating elements, or strings of charac‐
91 ters. Strings must be enclosed in double-quotes ("). Literal double-
92 quotes within strings must be preceded by the <escape character>, as
93 described below. When a keyword is followed by more than one operand,
94 the operands must be separated by semicolons (;). Blank characters are
95 allowed both before and after a semicolon.
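The header, body, and trailer rules above can be illustrated with a small reader. This is a simplified sketch, not the localedef implementation: the function name parse_categories is hypothetical, and the code assumes the default comment character and ignores escape characters and semicolons inside quoted strings.

```python
def parse_categories(source, comment_char="#"):
    """Split locale definition text into {category: [(keyword, operands)]}."""
    categories = {}
    current = None  # name of the category whose body is being read
    for raw in source.splitlines():
        # Blank lines and lines with the comment character in column 1
        # are ignored.
        if not raw.strip() or raw.startswith(comment_char):
            continue
        fields = raw.split(None, 1)  # identifier, then optional operands
        if current is None:
            name = fields[0]
            if not name.startswith("LC_"):
                raise ValueError(f"expected a category header, got {raw!r}")
            if name in categories:
                raise ValueError(f"category {name} defined more than once")
            categories[name] = []
            current = name
        elif raw.split() == ["END", current]:
            current = None  # trailer closes the category
        else:
            if fields[0].startswith("LC_"):
                raise ValueError(f"keyword cannot start with LC_: {raw!r}")
            operands = fields[1].split(";") if len(fields) > 1 else []
            # Blanks are allowed before and after each semicolon.
            categories[current].append(
                (fields[0], [op.strip() for op in operands])
            )
    if current is not None:
        raise ValueError(f"missing trailer: END {current}")
    return categories
```

Calling parse_categories() on a file with two LC_CTYPE sections raises ValueError, which mirrors the rule that a category can be defined only once.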
96
97
98 The first category header in the file can be preceded by a line modify‐
99 ing the comment character. It has the following format, starting in
100 column 1:
101
102 "comment_char %c\n",<comment character>
103
104
105
106 The comment character defaults to the number sign (#). Blank lines and
107 lines containing the <comment character> in the first position are
108 ignored.
109
110
111 The first category header in the file can be preceded by a line modify‐
112 ing the escape character to be used in the file. It has the following
113 format, starting in column 1:
114
115 "escape_char %c\n",<escape character>
116
117
118
119
120 The escape character defaults to backslash.
121
122
123 A line can be continued by placing an escape character as the last
124 character on the line; this continuation character will be discarded
125 from the input. Although the implementation need not accept any one
126 portion of a continued line with a length exceeding {LINE_MAX} bytes,
127 it places no limits on the accumulated length of the continued line.
128 Comment lines cannot be continued on a subsequent line using an escaped
129 newline character.
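The continuation rule above can be sketched as follows. The function name join_continuations is hypothetical, and the sketch assumes the default escape and comment characters; it does not treat an escaped escape character at the end of a line specially.

```python
def join_continuations(lines, escape_char="\\", comment_char="#"):
    """Join lines that end with the escape character into logical lines."""
    joined = []
    pending = ""  # text accumulated from earlier continued lines
    for line in lines:
        # A comment line cannot be continued, so pass it through unchanged.
        if not pending and line.startswith(comment_char):
            joined.append(line)
            continue
        if line.endswith(escape_char):
            # Discard the continuation character and keep accumulating.
            pending += line[:-1]
        else:
            joined.append(pending + line)
            pending = ""
    if pending:
        joined.append(pending)
    return joined
```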
130
131
132 Individual characters, characters in strings, and collating elements
133 must be represented using symbolic names, as defined below. In addi‐
134 tion, characters can be represented using the characters themselves or
135 as octal, hexadecimal or decimal constants. When non-symbolic notation
136 is used, the resultant locale definitions will in many cases not be
137 portable between systems. The left angle bracket (<) is a reserved sym‐
138 bol, denoting the start of a symbolic name; when used to represent
139 itself it must be preceded by the escape character. The following rules
140 apply to character representation:
141
142 1. A character can be represented via a symbolic name, enclosed
143 within angle brackets < and >. The symbolic name, including
144 the angle brackets, must exactly match a symbolic name
145 defined in the charmap file specified via the localedef -f
146 option, and will be replaced by a character value determined
147 from the value associated with the symbolic name in the
148 charmap file. The use of a symbolic name not found in the
149 charmap file constitutes an error, unless the category is
150 LC_CTYPE or LC_COLLATE, in which case it constitutes a
151 warning condition (see localedef(1) for a description of
152 action resulting from errors and warnings). The specifica‐
153 tion of a symbolic name in a collating-element or collating-
154 symbol section that duplicates a symbolic name in the
155 charmap file (if present) is an error. Use of the escape
156 character or a right angle bracket within a symbolic name is
157 invalid unless the character is preceded by the escape char‐
158 acter.
159
160 Example:
161
162 <C>;<c-cedilla> "<M><a><y>"
163
164
165
166 2. A character can be represented by the character itself, in
167 which case the value of the character is implementation-
168 dependent. Within a string, the double-quote character, the
169 escape character and the right angle bracket character must
170 be escaped (preceded by the escape character) to be inter‐
171 preted as the character itself. Outside strings, the charac‐
172 ters
173
174 , ; < > escape_char
175
176
177 must be escaped to be interpreted as the character itself.
178
179 Example:
180
181 c "May"
182
183
184
185 3. A character can be represented as an octal constant. An
186 octal constant is specified as the escape character followed
187 by two or more octal digits. Each constant represents a byte
188 value. Multi-byte values can be represented by concatenated
189 constants specified in byte order with the last constant
190 specifying the least significant byte of the character.
191
192 Example:
193
194 \143;\347;\143\150 "\115\141\171"
195
196
197
198 4. A character can be represented as a hexadecimal constant. A
199 hexadecimal constant is specified as the escape character
200 followed by an x followed by two or more hexadecimal digits.
201 Each constant represents a byte value. Multi-byte values can
202 be represented by concatenated constants specified in byte
203 order with the last constant specifying the least signifi‐
204 cant byte of the character.
205
206 Example:
207
208 \x63;\xe7;\x63\x68 "\x4d\x61\x79"
209
210
211
212 5. A character can be represented as a decimal constant. A dec‐
213 imal constant is specified as the escape character followed
214 by a d followed by two or more decimal digits. Each constant
215 represents a byte value. Multi-byte values can be repre‐
216 sented by concatenated constants specified in byte order
217 with the last constant specifying the least significant byte
218 of the character.
219
220 Example:
221
222 \d99;\d231;\d99\d104 "\d77\d97\d121"
223
224
225 Only characters existing in the character set for which the
226 locale definition is created can be specified, whether using
227 symbolic names, the characters themselves, or octal, decimal
228 or hexadecimal constants. If a charmap file is present, only
229 characters defined in the charmap can be specified using
230 octal, decimal or hexadecimal constants. Symbolic names not
231 present in the charmap file can be specified and will be
232 ignored, as specified under item 1 above.
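The octal, hexadecimal, and decimal notations in items 3 to 5 can be checked with a short decoder. This is a sketch that assumes the default backslash escape character; the name decode_constants is hypothetical, and the code handles only numeric constants, not symbolic names or literal characters.

```python
import re

# escape + x + hex digits, escape + d + decimal digits, or escape + octal digits
_CONSTANT = re.compile(r"\\(?:x([0-9A-Fa-f]{2,})|d([0-9]{2,})|([0-7]{2,}))")


def decode_constants(token):
    """Decode concatenated numeric constants into a byte string."""
    out = bytearray()
    pos = 0
    while pos < len(token):
        match = _CONSTANT.match(token, pos)
        if match is None:
            raise ValueError(f"not a numeric constant at offset {pos}: {token!r}")
        hex_digits, dec_digits, oct_digits = match.groups()
        if hex_digits is not None:
            value = int(hex_digits, 16)
        elif dec_digits is not None:
            value = int(dec_digits, 10)
        else:
            value = int(oct_digits, 8)
        if value > 0xFF:
            # Each constant represents a single byte value.
            raise ValueError(f"constant does not fit in one byte: {match.group(0)}")
        out.append(value)
        pos = match.end()
    return bytes(out)
```

All three notations of the two-byte example decode to the same bytes: decode_constants(r"\143\150"), decode_constants(r"\x63\x68"), and decode_constants(r"\d99\d104") each return b"ch".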
233
234 LC_CTYPE
235 The LC_CTYPE category defines character classification, case conver‐
236 sion and other character attributes. In addition, a series of charac‐
237 ters can be represented by three adjacent periods representing an
238 ellipsis symbol (...). The ellipsis specification is interpreted as
239 meaning that all values between the values preceding and following it
240 represent valid characters. The ellipsis specification is valid only
241 within a single encoded character set, that is, within a group of char‐
242 acters of the same size. An ellipsis is interpreted as including in the
243 list all characters with an encoded value higher than the encoded value
244 of the character preceding the ellipsis and lower than the encoded
245 value of the character following the ellipsis.
246
247
248 Example:
249
250 \x30;...;\x39;
251
252
253
254
255 includes in the character class all characters with encoded values
256 between the endpoints.
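The ellipsis expansion can be sketched for single-byte encoded values. The function name expand_ellipsis and the use of integers for encoded values are assumptions for illustration; the sketch does not check that both endpoints belong to the same encoded character set.

```python
ELLIPSIS = "..."


def expand_ellipsis(operands):
    """Replace each ellipsis with the encoded values strictly between
    its neighbours; endpoints are kept as they were listed."""
    result = []
    for index, item in enumerate(operands):
        if item != ELLIPSIS:
            result.append(item)
            continue
        if index == 0 or index == len(operands) - 1:
            raise ValueError("ellipsis needs a value on both sides")
        low, high = operands[index - 1], operands[index + 1]
        if ELLIPSIS in (low, high) or low >= high:
            raise ValueError("ellipsis endpoints must be ascending values")
        # Higher than the preceding value, lower than the following one.
        result.extend(range(low + 1, high))
    return result
```

For the example above, bytes(expand_ellipsis([0x30, ELLIPSIS, 0x39])) yields b"0123456789".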
257
258
259 The following keywords are recognized. In the descriptions, the term
260 "automatically included" means that it is not an error either to
261 include or omit any of the referenced characters.
262
263
264 The character classes digit, xdigit, lower, upper, and space have a set
265 of automatically included characters. These only need to be specified
266 if the character values (that is, encoding) differ from the implementa‐
267 tion default values.
268
269 upper Define characters to be classified as upper-case let‐
270 ters.
271
272 In the POSIX locale, the 26 upper-case letters are
273 included:
274
275 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
276
277
278 In a locale definition file, no character specified
279 for the keywords cntrl, digit, punct, or space can be
280 specified. The upper-case letters A to Z are automat‐
281 ically included in this class.
282
283
284 lower Define characters to be classified as lower-case let‐
285 ters. In the POSIX locale, the 26 lower-case letters
286 are included:
287
288 a b c d e f g h i j k l m n o p q r s t u v w x y z
289
290
291 In a locale definition file, no character specified
292 for the keywords cntrl, digit, punct, or space can be
293 specified. The lower-case letters a to z of the por‐
294 table character set are automatically included in
295 this class.
296
297
298 alpha Define characters to be classified as letters.
299
300 In the POSIX locale, all characters in the classes
301 upper and lower are included.
302
303 In a locale definition file, no character specified
304 for the keywords cntrl, digit, punct, or space can be
305 specified. Characters classified as either upper or
306 lower are automatically included in this class.
307
308
309 digit Define the characters to be classified as numeric
310 digits.
311
312 In the POSIX locale, only
313
314 0 1 2 3 4 5 6 7 8 9
315
316
317 are included.
318
319 In a locale definition file, only the digits 0, 1, 2,
320 3, 4, 5, 6, 7, 8, and 9 can be specified, and in con‐
321 tiguous ascending sequence by numerical value. The
322 digits 0 to 9 of the portable character set are auto‐
323 matically included in this class.
324
325 The definition of character class digit requires that
326 only ten characters, the ones defining digits, can be
327 specified; alternative digits (for example, Hindi or
328 Kanji) cannot be specified here.
329
330
331 alnum Define characters to be classified as letters and
332 numeric digits. Only the characters specified for the
333 alpha and digit keywords can be specified. Characters
334 specified for the keywords alpha and digit are auto‐
335 matically included in this class.
336
337
338 space Define characters to be classified as white-space
339 characters.
340
341 In the POSIX locale, at a minimum, the characters
342 SPACE, FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and
343 VERTICAL TAB are included.
344
345 In a locale definition file, no character specified
346 for the keywords upper, lower, alpha, digit, graph,
347 or xdigit can be specified. The characters SPACE,
348 FORMFEED, NEWLINE, CARRIAGE RETURN, TAB, and VERTI‐
349 CAL TAB of the portable character set, and any char‐
350 acters included in the class blank are automatically
351 included in this class.
352
353
354 cntrl Define characters to be classified as control charac‐
355 ters.
356
357 In the POSIX locale, no characters in classes alpha
358 or print are included.
359
360 In a locale definition file, no character specified
361 for the keywords upper, lower, alpha, digit, punct,
362 graph, print, or xdigit can be specified.
363
364
365 punct Define characters to be classified as punctuation
366 characters.
367
368 In the POSIX locale, neither the space character nor
369 any characters in classes alpha, digit, or cntrl are
370 included.
371
372 In a locale definition file, no character specified
373 for the keywords upper, lower, alpha, digit, cntrl,
374 xdigit or as the space character can be specified.
375
376
377 graph Define characters to be classified as printable char‐
378 acters, not including the space character.
379
380 In the POSIX locale, all characters in classes alpha,
381 digit, and punct are included; no characters in class
382 cntrl are included.
383
384 In a locale definition file, characters specified for
385 the keywords upper, lower, alpha, digit, xdigit, and
386 punct are automatically included in this class. No
387 character specified for the keyword cntrl can be
388 specified.
389
390
391 print Define characters to be classified as printable char‐
392 acters, including the space character.
393
394 In the POSIX locale, all characters in class graph
395 are included; no characters in class cntrl are
396 included.
397
398 In a locale definition file, characters specified for
399 the keywords upper, lower, alpha, digit, xdigit,
400 punct, and the space character are automatically
401 included in this class. No character specified for
402 the keyword cntrl can be specified.
403
404
405 xdigit Define the characters to be classified as hexadecimal
406 digits.
407
408 In the POSIX locale, only:
409
410 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
411
412
413 are included.
414
415 In a locale definition file, only the characters
416 defined for the class digit can be specified, in con‐
417 tiguous ascending sequence by numerical value, fol‐
418 lowed by one or more sets of six characters repre‐
419 senting the hexadecimal digits 10 to 15 inclusive,
420 with each set in ascending order (for example A, B,
421 C, D, E, F, a, b, c, d, e, f). The digits 0 to 9, the
422 upper-case letters A to F and the lower-case letters
423 a to f of the portable character set are automati‐
424 cally included in this class.
425
426 The definition of character class xdigit requires
427 that the characters included in character class digit
428 be included here also.
429
430
431 blank Define characters to be classified as blank charac‐
432 ters.
433
434 In the POSIX locale, only the space and tab charac‐
435 ters are included.
436
437 In a locale definition file, the characters space and
438 tab are automatically included in this class.
439
440
441 charclass Define one or more locale-specific character class
442 names as strings separated by semi-colons. Each named
443 character class can then be defined subsequently in
444 the LC_CTYPE definition. A character class name con‐
445 sists of at least one and at most {CHAR‐
446 CLASS_NAME_MAX} bytes of alphanumeric characters from
447 the portable filename character set. The first char‐
448 acter of a character class name cannot be a digit.
449 The name cannot match any of the LC_CTYPE keywords
450 defined in this document.
451
452
453 charclass-name Define characters to be classified as belonging to
454 the named locale-specific character class. In the
455 POSIX locale, the locale-specific named character
456 classes need not exist. If a class name is defined by
457 a charclass keyword, but no characters are subse‐
458 quently assigned to it, this is not an error; it rep‐
459 resents a class without any characters belonging to
460 it. The charclass-name can be used as the property
461 argument to the wctype(3C) function, in regular
462 expression and shell pattern-matching bracket expres‐
463 sions, and by the tr(1) command.
464
465
466 toupper Define the mapping of lower-case letters to upper-
467 case letters.
468
469 In the POSIX locale, at a minimum, the 26 lower-case
470 characters:
471
472 a b c d e f g h i j k l m n o p q r s t u v w x y z
473
474
475 are mapped to the corresponding 26 upper-case charac‐
476 ters:
477
478 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
479
480
481 In a locale definition file, the operand consists of
482 character pairs, separated by semicolons. The charac‐
483 ters in each character pair are separated by a comma
484 and the pair enclosed by parentheses. The first char‐
485 acter in each pair is the lower-case letter, the sec‐
486 ond the corresponding upper-case letter. Only charac‐
487 ters specified for the keywords lower and upper can
488 be specified. The lower-case letters a to z, and
489 their corresponding upper-case letters A to Z, of the
490 portable character set are automatically included in
491 this mapping, but only when the toupper keyword is
492 omitted from the locale definition.
493
494
495 tolower Define the mapping of upper-case letters to lower-
496 case letters.
497
498 In the POSIX locale, at a minimum, the 26 upper-case
499 characters:
500
501 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
502
503
504 are mapped to the corresponding 26 lower-case charac‐
505 ters:
506
507 a b c d e f g h i j k l m n o p q r s t u v w x y z
508
509
510 In a locale definition file, the operand consists of
511 character pairs, separated by semicolons. The charac‐
512 ters in each character pair are separated by a comma
513 and the pair enclosed by parentheses. The first char‐
514 acter in each pair is the upper-case letter, the sec‐
515 ond the corresponding lower-case letter. Only charac‐
516 ters specified for the keywords lower and upper can
517 be specified. If the tolower keyword is omitted from
518 the locale definition, the mapping will be the
519 reverse mapping of the one specified for toupper.
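The pair syntax used by toupper and tolower, and the reverse-mapping default for an omitted tolower, can be sketched as follows. The function names are hypothetical, and the code assumes operands written as symbolic names that contain no commas, semicolons, or parentheses.

```python
def parse_case_pairs(operand):
    """Parse "(<a>,<A>);(<b>,<B>)" into {"<a>": "<A>", "<b>": "<B>"}."""
    mapping = {}
    for pair in operand.split(";"):
        pair = pair.strip()
        if not (pair.startswith("(") and pair.endswith(")")):
            raise ValueError(f"pair must be enclosed in parentheses: {pair!r}")
        # Unpacking fails with ValueError unless there are exactly two parts.
        first, second = (part.strip() for part in pair[1:-1].split(","))
        mapping[first] = second
    return mapping


def default_tolower(toupper):
    """Build the tolower mapping used when the tolower keyword is omitted:
    the reverse of the toupper mapping."""
    return {upper: lower for lower, upper in toupper.items()}
```

With toupper = parse_case_pairs("(<a>,<A>);(<b>,<B>)"), default_tolower(toupper) maps <A> to <a> and <B> to <b>.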
520
521
522 LC_COLLATE
523 The LC_COLLATE category provides a collation sequence definition for
524 numerous utilities (such as sort(1), uniq(1), and so forth), regular
525 expression matching (see regex(5)), and the strcoll(3C), strxfrm(3C),
526 wcscoll(3C), and wcsxfrm(3C) functions.
527
528
529 A collation sequence definition defines the relative order between col‐
530 lating elements (characters and multi-character collating elements) in
531 the locale. This order is expressed in terms of collation values, that
532 is, by assigning each element one or more collation values (also known
533 as collation weights). The following capabilities are provided:
534
535 1. Multi-character collating elements. Specification of multi-
536 character collating elements (that is, sequences of two or
537 more characters to be collated as an entity).
538
539 2. User-defined ordering of collating elements. Each collating
540 element is assigned a collation value defining its order in
541 the character (or basic) collation sequence. This ordering
542 is used by regular expressions and pattern matching and,
543 unless collation weights are explicitly specified, also as
544 the collation weight to be used in sorting.
545
546 3. Multiple weights and equivalence classes. Collating elements
547 can be assigned one or more (up to the limit
548 {COLL_WEIGHTS_MAX}) collating weights for use in sorting.
549 The first weight is hereafter referred to as the primary
550 weight.
551
552 4. One-to-Many mapping. A single character is mapped into a
553 string of collating elements.
554
555 5. Equivalence class definition. Two or more collating elements
556 have the same collation value (primary weight).
557
558 6. Ordering by weights. When two strings are compared to deter‐
559 mine their relative order, the two strings are first broken
560 up into a series of collating elements. The elements in each
561 successive pair of elements are then compared according to
562 the relative primary weights for the elements. If equal, and
563 more than one weight has been assigned, the pairs of collat‐
564 ing elements are recompared according to the relative subse‐
565 quent weights, until either a pair of collating elements
566 compare unequal or the weights are exhausted.
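The level-by-level comparison in item 6 can be sketched as below. The weight values and the name compare are made up for illustration, and the strings are assumed to be already split into collating elements.

```python
def compare(left, right, weights, levels):
    """Compare two sequences of collating elements level by level.

    weights maps each element to a tuple with one weight per level.
    Returns -1, 0, or 1.
    """
    for level in range(levels):
        left_weights = [weights[element][level] for element in left]
        right_weights = [weights[element][level] for element in right]
        if left_weights != right_weights:
            # The first level at which the strings differ decides the order.
            return -1 if left_weights < right_weights else 1
    return 0  # equal at every level


# Hypothetical two-level table: "a" and "A" share a primary weight
# (same equivalence class) and differ only at the second level.
WEIGHTS = {"a": (1, 1), "A": (1, 2), "b": (2, 3)}
```

With this table, "ab" and "Ab" tie on primary weights and are separated only by the secondary weights.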
567
568
569 The following keywords are recognized in a collation sequence defini‐
570 tion. They are described in detail in the following sections.
571
572 copy Specify the name of an existing locale which is
573 used as the definition of this category. If this
574 keyword is specified, no other keyword can be
575 specified.
576
577
578 collating-element Define a collating-element symbol representing a
579 multi-character collating element. This keyword is
580 optional.
581
582
583 collating-symbol Define a collating symbol for use in collation
584 order statements. This keyword is optional.
585
586
587 order_start Define collation rules. This statement is followed
588 by one or more collation order statements, assign‐
589 ing character collation values and collation
590 weights to collating elements.
591
592
593 order_end Specify the end of the collation-order statements.
594
595
596 collating-element keyword
597 In addition to the collating elements in the character set, the collat‐
598 ing-element keyword is used to define multi-character collating ele‐
599 ments. The syntax is:
600
601 "collating-element %s from \"%s\"\n",<collating-symbol>,<string>
602
603
604
605 The <collating-symbol> operand is a symbolic name, enclosed between
606 angle brackets (< and >), and must not duplicate any symbolic name in
607 the current charmap file (if any), or any other symbolic name defined
608 in this collation definition. The string operand is a string of two or
609 more characters that collates as an entity. A <collating-element>
610 defined via this keyword is only recognized with the LC_COLLATE cate‐
611 gory.
612
613
614 Example:
615 collating-element <ch> from "<c><h>"
616 collating-element <e-acute> from "<acute><e>"
617 collating-element <ll> from "ll"
618
619 collating-symbol keyword
620 This keyword will be used to define symbols for use in collation
621 sequence statements; that is, between the order_start and the order_end
622 keywords. The syntax is:
623
624 "collating-symbol %s\n",<collating-symbol>
625
626
627
628 The <collating-symbol> is a symbolic name, enclosed between angle
629 brackets (< and >), and must not duplicate any symbolic name in the
630 current charmap file (if any), or any other symbolic name defined in
631 this collation definition.
632
633
634 A collating-symbol defined via this keyword is only recognized with the
635 LC_COLLATE category.
636
637
638 Example:
639 collating-symbol <UPPER_CASE>
640 collating-symbol <HIGH>
641
642
643 The collating-symbol keyword defines a symbolic name that can be asso‐
644 ciated with a relative position in the character order sequence. While
645 such a symbolic name does not represent any collating element, it can
646 be used as a weight.
647
648 order_start keyword
649 The order_start keyword must precede collation order entries and also
650 defines the number of weights for this collation sequence definition
651 and other collation rules.
652
653
654 The syntax of the order_start keyword is:
655
656 "order_start %s;%s;...;%s\n",<sort-rules>,<sort-rules>
657
658
659
660 The operands to the order_start keyword are optional. If present, the
661 operands define rules to be applied when strings are compared. The
662 number of operands defines how many weights each element is assigned. If no
663 operands are present, one forward operand is assumed. If present, the
664 first operand defines rules to be applied when comparing strings using
665 the first (primary) weight; the second when comparing strings using the
666 second weight, and so on. Operands are separated by semicolons (;).
667 Each operand consists of one or more collation directives, separated by
668 commas (,). If the number of operands exceeds the {COLL_WEIGHTS_MAX}
669 limit, the utility will issue a warning message. The following direc‐
670 tives will be supported:
671
672 forward Specifies that comparison operations for the weight level
673 proceed from start of string towards the end of string.
674
675
676 backward Specifies that comparison operations for the weight level
677 proceed from end of string towards the beginning of string.
678
679
680 position Specifies that comparison operations for the weight level
681 will consider the relative position of elements in the
682 strings not subject to IGNORE. The string containing an
683 element not subject to IGNORE after the fewest collating
684 elements subject to IGNORE from the start of the compare
685 will collate first. If both strings contain a character not
686 subject to IGNORE in the same relative position, the col‐
687 lating values assigned to the elements will determine the
688 ordering. In case of equality, subsequent characters not
689 subject to IGNORE are considered in the same manner.
690
691
692
693 The directives forward and backward are mutually exclusive.
694
695
696 Example:
697
698 order_start forward;backward
699
700
701
702
703 If no operands are specified, a single forward operand is assumed.
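The effect of the forward and backward directives, and of IGNORE, can be sketched with a sort key. The weight table below is hypothetical and treats each character as one collating element; the position directive and comma-separated directive lists are not handled in this sketch.

```python
IGNORE = None  # marker for a weight that is ignored at a given level


def collation_key(text, table, directives=("forward",)):
    """Build a sort key with one weight list per order_start operand."""
    key = []
    for level, directive in enumerate(directives):
        level_weights = [table[ch][level] for ch in text]
        # Elements subject to IGNORE behave as if absent at this level.
        level_weights = [w for w in level_weights if w is not IGNORE]
        if directive == "backward":
            level_weights.reverse()  # compare from the end of the string
        elif directive != "forward":
            raise ValueError(f"unsupported directive: {directive}")
        key.append(level_weights)
    return tuple(key)


# Hypothetical weights: accented letters share the primary weight of
# their base letter and differ only at the second level.
TABLE = {
    "c": (1, 1), "e": (2, 1), "é": (2, 2),
    "o": (3, 1), "ô": (3, 2), "t": (4, 1),
    "-": (IGNORE, IGNORE),
}
WORDS = ["côté", "coté", "côte", "cote"]
```

Sorting WORDS with directives ("forward", "backward") places "côte" before "coté", because the second level is compared from the end of each string; with ("forward", "forward") the two words swap places.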
704
705 Collation Order
706 The order_start keyword is followed by collating identifier entries.
707 The syntax for the collating element entries is:
708
709 "%s %s;%s;...;%s\n",<collating-identifier>,<weight>,<weight>,...
710
711
712
713 Each collating-identifier consists of either a character described in
714 Locale Definition above, a <collating-element>, a <collating-symbol>,
715 an ellipsis, or the special symbol UNDEFINED. The order in which col‐
716 lating elements are specified determines the character order sequence,
717 such that each collating element compares less than the elements fol‐
718 lowing it. The NUL character compares lower than any other character.
719
720
721 A <collating-element> is used to specify multi-character collating ele‐
722 ments, and indicates that the character sequence specified via the
723 <collating-element> is to be collated as a unit and in the relative
724 order specified by its place.
725
726
727 A <collating-symbol> is used to define a position in the relative order
728 for use in weights. No weights are specified with a <collating-symbol>.
729
730
731 The ellipsis symbol specifies that a sequence of characters will col‐
732 late according to their encoded character values. It is interpreted as
733 indicating that all characters with a coded character set value higher
734 than the value of the character in the preceding line, and lower than
735 the coded character set value for the character in the following line,
736 in the current coded character set, will be placed in the character
737 collation order between the previous and the following character in
738 ascending order according to their coded character set values. An ini‐
739 tial ellipsis is interpreted as if the preceding line specified the NUL
740 character, and a trailing ellipsis as if the following line specified
741 the highest coded character set value in the current coded character
742 set. An ellipsis is treated as invalid if the preceding or following
743 lines do not specify characters in the current coded character set. The
744 use of the ellipsis symbol ties the definition to a specific coded
745 character set and may preclude the definition from being portable
746 between implementations.
747
748
749 The symbol UNDEFINED is interpreted as including all coded character
750 set values not specified explicitly or via the ellipsis symbol. Such
751 characters are inserted in the character collation order at the point
752 indicated by the symbol, and in ascending order according to their
753 coded character set values. If no UNDEFINED symbol is specified, and
754 the current coded character set contains characters not specified in
755 this section, the utility will issue a warning message and place such
756 characters at the end of the character collation order.
757
758
759 The optional operands for each collation-element are used to define the
760 primary, secondary, or subsequent weights for the collating element.
761 The first operand specifies the relative primary weight, the second the
762 relative secondary weight, and so on. Two or more collation-elements
763 can be assigned the same weight; they belong to the same equivalence
764 class if they have the same primary weight. Collation behaves as if,
765 for each weight level, elements subject to IGNORE are removed, unless
766 the position collation directive is specified for the corresponding
767 level with the order_start keyword. Then each successive pair of ele‐
768 ments is compared according to the relative weights for the elements.
769 If the two strings compare equal, the process is repeated for the next
770 weight level, up to the limit {COLL_WEIGHTS_MAX}.
771
772
773 Weights are expressed as characters described in Locale Definition
774 above, <collating-symbol>s, <collating-element>s, an ellipsis, or the
775 special symbol IGNORE. A single character, a <collating-symbol> or a
776 <collating-element> represents the relative position in the character
777 collating sequence of the character or symbol, rather than the charac‐
778 ter or characters themselves. Thus, rather than assigning absolute val‐
779 ues to weights, a particular weight is expressed using the relative
780 order value assigned to a collating element based on its order in the
781 character collation sequence.
782
783
784 One-to-many mapping is indicated by specifying two or more concatenated
785 characters or symbolic names. For example, if the character <eszet> is
786 given the string "<s><s>" as a weight, comparisons are performed as if
787 all occurrences of the character <eszet> are replaced by <s><s> (assum‐
788 ing that <s> has the collating weight <s>). If it is necessary to
789 define <eszet> and <s><s> as an equivalence class, then a collating
790 element must be defined for the string ss.
791
792
793 All characters specified via an ellipsis will by default be assigned
794 unique weights, equal to the relative order of characters. Characters
795 specified via an explicit or implicit UNDEFINED special symbol will by
796 default be assigned the same primary weight (that is, belong to the
797 same equivalence class). An ellipsis symbol as a weight is interpreted
798 to mean that each character in the sequence has unique weights, equal
799 to the relative order of their character in the character collation
800 sequence. The use of the ellipsis as a weight is treated as an error if
801 the collating element is neither an ellipsis nor the special symbol
802 UNDEFINED.
803
804
805 The special keyword IGNORE as a weight indicates that when strings are
806 compared using the weights at the level where IGNORE is specified, the
807 collating element is ignored; that is, as if the string did not contain
808 the collating element. In regular expressions and pattern matching, all
809 characters that are subject to IGNORE in their primary weight form an
810 equivalence class.
811
812
813 An empty operand is interpreted as the collating element itself.
814
815
816 For example, the order statement:
817
818 <a> <a>;<a>
819
820
821
822
823 is equal to:
824
825 <a>
826
827
828
829
830 An ellipsis can be used as an operand if the collating element was an
831 ellipsis, and is interpreted as the value of each character defined by
832 the ellipsis.
833
834
835 The collation order as defined in this section defines the interpreta‐
836 tion of bracket expressions in regular expressions.
837
838
839 Example:
840
841
842
843
844 order_start forward;backward
845 UNDEFINED IGNORE;IGNORE
846 <LOW>
847 <space> <LOW>;<space>
848 ... <LOW>;...
849 <a> <a>;<a>
850 <a-acute> <a>;<a-acute>
851 <a-grave> <a>;<a-grave>
852 <A> <a>;<A>
853 <A-acute> <a>;<A-acute>
854 <A-grave> <a>;<A-grave>
855 <ch> <ch>;<ch>
856 <Ch> <ch>;<Ch>
857 <s> <s>;<s>
858 <eszet> "<s><s>";"<eszet><eszet>"
859 order_end
860
861
862
863 This example is interpreted as follows:
864
865 1. The UNDEFINED means that all characters not specified in
866 this definition (explicitly or via the ellipsis) are ignored
867 for collation purposes; for regular expression purposes they
868 are ordered first.
869
870 2. All characters between <space> and <a> have the same primary
871 equivalence class and individual secondary weights based on
872 their ordinal encoded values.
873
874 3. All characters based on the upper- or lower-case character a
875 belong to the same primary equivalence class.
876
877 4. The multi-character collating element <ch> is represented by
878 the collating symbol <ch> and belongs to the same primary
879 equivalence class as the multi-character collating element
880 <Ch>.
881
882 order_end keyword
883 The collating order entries must be terminated with an order_end key‐
884 word.
885
886 LC_MONETARY
The LC_MONETARY category defines the rules and symbols that are used
to format monetary numeric information. This information is available
through the localeconv(3C) function.
890
891
892 The following items are defined in this category of the locale. The
893 item names are the keywords recognized by the localedef(1) utility when
894 defining a locale. They are also similar to the member names of the
895 lconv structure defined in <locale.h>. The localeconv function returns
896 {CHAR_MAX} for unspecified integer items and the empty string ("") for
897 unspecified or size zero string items.
898
899
900 In a locale definition file the operands are strings. For some key‐
901 words, the strings can contain only integers. Keywords that are not
902 provided, string values set to the empty string (""), or integer key‐
903 words set to -1, are used to indicate that the value is not available
904 in the locale.
905
906 int_curr_symbol The international currency symbol. The operand is
907 a four-character string, with the first three
908 characters containing the alphabetic interna‐
909 tional currency symbol in accordance with those
910 specified in the ISO 4217 standard. The fourth
911 character is the character used to separate the
912 international currency symbol from the monetary
913 quantity.
914
915
916 currency_symbol The string used as the local currency symbol.
917
918
919 mon_decimal_point The operand is a string containing the symbol
920 that is used as the decimal delimiter (radix
921 character) in monetary formatted quantities.
922
923
924 mon_thousands_sep The operand is a string containing the symbol
925 that is used as a separator for groups of digits
926 to the left of the decimal delimiter in formatted
927 monetary quantities.
928
929
930 mon_grouping Define the size of each group of digits in for‐
931 matted monetary quantities. The operand is a
932 sequence of integers separated by semicolons.
933 Each integer specifies the number of digits in
934 each group, with the initial integer defining the
935 size of the group immediately preceding the deci‐
936 mal delimiter, and the following integers defin‐
937 ing the preceding groups. If the last integer is
938 not -1, then the size of the previous group (if
939 any) will be repeatedly used for the remainder of
940 the digits. If the last integer is -1, then no
941 further grouping will be performed.
942
943 The following is an example of the interpretation
944 of the mon_grouping keyword. Assuming that the
945 value to be formatted is 123456789 and the
946 mon_thousands_sep is ', then the following table
947 shows the result. The third column shows the
948 equivalent string in the ISO C standard that
949 would be used by the localeconv function to
950 accommodate this grouping.
951
mon_grouping    Formatted Value    ISO C String

3;-1            123456'789         "\3\177"
3               123'456'789        "\3"
3;2;-1          1234'56'789        "\3\2\177"
3;2             12'34'56'789       "\3\2"
-1              123456789          "\177"
959
960
961 In these examples, the octal value of {CHAR_MAX}
962 is 177.
963
964
965 positive_sign A string used to indicate a non-negative-valued
966 formatted monetary quantity.
967
968
969 negative_sign A string used to indicate a negative-valued for‐
970 matted monetary quantity.
971
972
973 int_frac_digits An integer representing the number of fractional
974 digits (those to the right of the decimal delim‐
975 iter) to be written in a formatted monetary quan‐
976 tity using int_curr_symbol.
977
978
979 frac_digits An integer representing the number of fractional
980 digits (those to the right of the decimal delim‐
981 iter) to be written in a formatted monetary quan‐
982 tity using currency_symbol.
983
984
985 p_cs_precedes In an application conforming to the SUSv3 stan‐
986 dard, an integer set to 1 if the currency_symbol
987 precedes the value for a monetary quantity with a
988 non-negative value, and set to 0 if the symbol
989 succeeds the value.
990
In an application not conforming to the SUSv3
standard, an integer set to 1 if the currency_symbol
or int_curr_symbol precedes the value for a
monetary quantity with a non-negative value, and
set to 0 if the symbol succeeds the value.
997
998
999 p_sep_by_space In an application conforming to the SUSv3 stan‐
1000 dard, an integer set to 0 if no space separates
1001 the currency_symbol from the value for a monetary
1002 quantity with a non-negative value, set to 1 if a
1003 space separates the symbol from the value, and
1004 set to 2 if a space separates the symbol and the
1005 sign string, if adjacent.
1006
1007 In an application not conforming to the SUSv3
1008 standard, an integer set to 0 if no space sepa‐
1009 rates the currency_symbol or int_curr_symbol from
1010 the value for a monetary quantity with a non-neg‐
1011 ative value, set to 1 if a space separates the
1012 symbol from the value, and set to 2 if a space
1013 separates the symbol and the sign string, if
1014 adjacent.
1015
1016
1017 n_cs_precedes In an application conforming to the SUSv3 stan‐
1018 dard, an integer set to 1 if the currency_symbol
1019 precedes the value for a monetary quantity with a
1020 negative value, and set to 0 if the symbol suc‐
1021 ceeds the value.
1022
In an application not conforming to the SUSv3
standard, an integer set to 1 if the currency_symbol
or int_curr_symbol precedes the value for a
monetary quantity with a negative value, and
set to 0 if the symbol succeeds the value.
1029
1030
1031 n_sep_by_space In an application conforming to the SUSv3 stan‐
1032 dard, an integer set to 0 if no space separates
1033 the currency_symbol from the value for a monetary
1034 quantity with a negative value, set to 1 if a
1035 space separates the symbol from the value, and
1036 set to 2 if a space separates the symbol and the
1037 sign string, if adjacent.
1038
1039 In an application not conforming to the SUSv3
1040 standard, an integer set to 0 if no space sepa‐
1041 rates the currency_symbol or int_curr_symbol from
1042 the value for a monetary quantity with a negative
1043 value, set to 1 if a space separates the symbol
1044 from the value, and set to 2 if a space separates
1045 the symbol and the sign string, if adjacent.
1046
1047
1048 p_sign_posn An integer set to a value indicating the posi‐
1049 tioning of the positive_sign for a monetary quan‐
1050 tity with a non-negative value. The following
1051 integer values are recognized for both
1052 p_sign_posn and n_sign_posn:
1053
1054 In an application conforming to the SUSv3 stan‐
1055 dard:
1056
1057 0 Parentheses enclose the quantity and the
1058 currency_symbol.
1059
1060
1061 1 The sign string precedes the quantity and
1062 the currency_symbol.
1063
1064
1065 2 The sign string succeeds the quantity and
1066 the currency_symbol.
1067
1068
1069 3 The sign string precedes the currency_sym‐
1070 bol.
1071
1072
1073 4 The sign string succeeds the currency_sym‐
1074 bol.
1075
1076 In an application not conforming to the SUSv3
1077 standard:
1078
1079 0 Parentheses enclose the quantity and the
1080 currency_symbol or int_curr_symbol.
1081
1082
1083 1 The sign string precedes the quantity and
1084 the currency_symbol or int_curr_symbol.
1085
1086
1087 2 The sign string succeeds the quantity and
1088 the currency_symbol or int_curr_symbol.
1089
1090
1091 3 The sign string precedes the currency_symbol
1092 or int_curr_symbol.
1093
1094
1095 4 The sign string succeeds the currency_symbol
1096 or int_curr_symbol.
1097
1098
1099
1100 n_sign_posn An integer set to a value indicating the posi‐
1101 tioning of the negative_sign for a negative for‐
1102 matted monetary quantity.
1103
1104
1105 int_p_cs_precedes An integer set to 1 if the int_curr_symbol pre‐
1106 cedes the value for a monetary quantity with a
1107 non-negative value, and set to 0 if the symbol
1108 succeeds the value.
1109
1110
1111 int_n_cs_precedes An integer set to 1 if the int_curr_symbol pre‐
1112 cedes the value for a monetary quantity with a
1113 negative value, and set to 0 if the symbol suc‐
1114 ceeds the value.
1115
1116
1117 int_p_sep_by_space An integer set to 0 if no space separates the
1118 int_curr_symbol from the value for a monetary
1119 quantity with a non-negative value, set to 1 if a
1120 space separates the symbol from the value, and
1121 set to 2 if a space separates the symbol and the
1122 sign string, if adjacent.
1123
1124
1125 int_n_sep_by_space An integer set to 0 if no space separates the
1126 int_curr_symbol from the value for a monetary
1127 quantity with a negative value, set to 1 if a
1128 space separates the symbol from the value, and
1129 set to 2 if a space separates the symbol and the
1130 sign string, if adjacent.
1131
1132
1133 int_p_sign_posn An integer set to a value indicating the posi‐
1134 tioning of the positive_sign for a positive mone‐
1135 tary quantity formatted with the international
1136 format. The following integer values are recog‐
1137 nized for int_p_sign_posn and int_n_sign_posn:
1138
1139 0 Parentheses enclose the quantity and the
1140 int_curr_symbol.
1141
1142
1143 1 The sign string precedes the quantity and
1144 the int_curr_symbol.
1145
1146
2 The sign string succeeds the quantity and
the int_curr_symbol.
1149
1150
1151 3 The sign string precedes the int_curr_sym‐
1152 bol.
1153
1154
1155 4 The sign string succeeds the int_curr_sym‐
1156 bol.
1157
1158
1159
1160 int_n_sign_posn An integer set to a value indicating the posi‐
1161 tioning of the negative_sign for a negative mone‐
1162 tary quantity formatted with the international
1163 format.
1164
1165
1166
The following table shows the result of various combinations of
p_cs_precedes, p_sign_posn, and p_sep_by_space:
1168
1169
1170
1171
                                          p_sep_by_space
                                     2          1          0
p_cs_precedes= 1   p_sign_posn= 0   ($1.25)    ($ 1.25)   ($1.25)
                   p_sign_posn= 1   + $1.25    +$ 1.25    +$1.25
                   p_sign_posn= 2   $1.25 +    $ 1.25+    $1.25+
                   p_sign_posn= 3   + $1.25    +$ 1.25    +$1.25
                   p_sign_posn= 4   $ +1.25    $+ 1.25    $+1.25
p_cs_precedes= 0   p_sign_posn= 0   (1.25 $)   (1.25 $)   (1.25$)
                   p_sign_posn= 1   +1.25 $    +1.25 $    +1.25$
                   p_sign_posn= 2   1.25$ +    1.25 $+    1.25$+
                   p_sign_posn= 3   1.25+ $    1.25 +$    1.25+$
                   p_sign_posn= 4   1.25$ +    1.25 $+    1.25$+
1184
1185
1186
1187 The monetary formatting definitions for the POSIX locale follow. The
1188 code listing depicts the localedef(1) input, the table representing the
1189 same information with the addition of localeconv(3C) and nl_lang‐
1190 info(3C) formats. All values are unspecified in the POSIX locale.
1191
1192 LC_MONETARY
1193 # This is the POSIX locale definition for
1194 # the LC_MONETARY category.
1195 #
1196 int_curr_symbol ""
1197 currency_symbol ""
1198 mon_decimal_point ""
1199 mon_thousands_sep ""
1200 mon_grouping -1
1201 positive_sign ""
1202 negative_sign ""
1203 int_frac_digits -1
1204 frac_digits -1
1205 p_cs_precedes -1
1206 p_sep_by_space -1
1207 n_cs_precedes -1
1208 n_sep_by_space -1
1209 p_sign_posn -1
1210 n_sign_posn -1
1211 int_p_cs_precedes -1
1212 int_p_sep_by_space -1
1213 int_n_cs_precedes -1
1214 int_n_sep_by_space -1
1215 int_p_sign_posn -1
1216 int_n_sign_posn -1
1217 #
1218 END LC_MONETARY
1219
1220
1221
1222
1223 The entry n/a indicates that the value is not available in the POSIX
1224 locale.
1225
1226 LC_NUMERIC
1227 The LC_NUMERIC category defines the rules and symbols that will be
1228 used to format non-monetary numeric information. This information is
1229 available through the localeconv(3C) function.
1230
1231
1232 The following items are defined in this category of the locale. The
1233 item names are the keywords recognized by the localedef utility when
1234 defining a locale. They are also similar to the member names of the
1235 lconv structure defined in <locale.h>. The localeconv() function
1236 returns {CHAR_MAX} for unspecified integer items and the empty string
1237 ("") for unspecified or size zero string items.
1238
1239
In a locale definition file the operands are strings. For some keywords,
the strings can contain only integers. Keywords that are not
provided, string values set to the empty string (""), or integer keywords
set to -1, are used to indicate that the value is not available
in the locale. The following keywords are recognized:
1245
1246 decimal_point The operand is a string containing the symbol that is
1247 used as the decimal delimiter (radix character) in
1248 numeric, non-monetary formatted quantities. This key‐
1249 word cannot be omitted and cannot be set to the empty
1250 string. In contexts where standards limit the deci‐
1251 mal_point to a single byte, the result of specifying a
1252 multi-byte operand is unspecified.
1253
1254
thousands_sep The operand is a string containing the symbol that is
used as a separator for groups of digits to the left
of the decimal delimiter in numeric, non-monetary
formatted quantities. In contexts where standards
limit the thousands_sep to a single byte, the
result of specifying a multi-byte operand is unspecified.
1262
1263
grouping Define the size of each group of digits in formatted
non-monetary quantities. The operand is a sequence of
integers separated by semicolons. Each integer specifies
the number of digits in each group, with the initial
integer defining the size of the group immediately
preceding the decimal delimiter, and the following
integers defining the preceding groups. If the
last integer is not -1, then the size of the previous
group (if any) will be repeatedly used for the remainder
of the digits. If the last integer is -1, then no
further grouping will be performed.

The non-monetary numeric formatting definitions for the POSIX locale
follow. The code listing depicts the localedef input; the table
represents the same information with the addition of localeconv()
values and nl_langinfo() constants.
1280
1281 LC_NUMERIC
1282 # This is the POSIX locale definition for
1283 # the LC_NUMERIC category.
1284 #
1285 decimal_point "<period>"
1286 thousands_sep ""
1287 grouping -1
1288 #
1289 END LC_NUMERIC
1290
1291
1292
1293
1294
1295
1296
                 POSIX locale   langinfo     localeconv()   localedef
Item             Value          Constant     Value          Value
────────────────────────────────────────────────────────────────────────
decimal_point    "."            RADIXCHAR    "."            .
thousands_sep    n/a            THOUSEP      ""             ""
grouping         n/a            -            ""             -1
1303
1304
1305
1306 The entry n/a indicates that the value is not available in the POSIX
1307 locale.
1308
1309 LC_TIME
1310 The LC_TIME category defines the interpretation of the field descrip‐
1311 tors supported by date(1) and affects the behavior of the strf‐
1312 time(3C), wcsftime(3C), strptime(3C), and nl_langinfo(3C) functions.
1313 Because the interfaces for C-language access and locale definition dif‐
1314 fer significantly, they are described separately. For locale defini‐
1315 tion, the following mandatory keywords are recognized:
1316
1317 abday Define the abbreviated weekday names, corresponding to
1318 the %a field descriptor (conversion specification in the
1319 strftime(), wcsftime(), and strptime() functions). The
1320 operand consists of seven semicolon-separated strings,
1321 each surrounded by double-quotes. The first string is
1322 the abbreviated name of the day corresponding to Sunday,
1323 the second the abbreviated name of the day corresponding
1324 to Monday, and so on.
1325
1326
1327 day Define the full weekday names, corresponding to the %A
1328 field descriptor. The operand consists of seven semi‐
1329 colon-separated strings, each surrounded by double-
1330 quotes. The first string is the full name of the day
1331 corresponding to Sunday, the second the full name of the
1332 day corresponding to Monday, and so on.
1333
1334
1335 abmon Define the abbreviated month names, corresponding to the
1336 %b field descriptor. The operand consists of twelve
1337 semicolon-separated strings, each surrounded by double-
1338 quotes. The first string is the abbreviated name of the
1339 first month of the year (January), the second the abbre‐
1340 viated name of the second month, and so on.
1341
1342
1343 mon Define the full month names, corresponding to the %B
1344 field descriptor. The operand consists of twelve semi‐
1345 colon-separated strings, each surrounded by double-
1346 quotes. The first string is the full name of the first
1347 month of the year (January), the second the full name of
1348 the second month, and so on.
1349
1350
1351 d_t_fmt Define the appropriate date and time representation,
1352 corresponding to the %c field descriptor. The operand
1353 consists of a string, and can contain any combination of
1354 characters and field descriptors. In addition, the
1355 string can contain the escape sequences \\, \a, \b, \f,
1356 \n, \r, \t, \v.
1357
1358
1359 date_fmt Define the appropriate date and time representation,
1360 corresponding to the %C field descriptor. The operand
1361 consists of a string, and can contain any combination of
1362 characters and field descriptors. In addition, the
1363 string can contain the escape sequences \\, \a, \b, \f,
1364 \n, \r, \t, \v.
1365
1366
1367 d_fmt Define the appropriate date representation, correspond‐
1368 ing to the %x field descriptor. The operand consists of
1369 a string, and can contain any combination of characters
1370 and field descriptors. In addition, the string can con‐
1371 tain the escape sequences \\, \a, \b, \f, \n, \r, \t,
1372 \v.
1373
1374
1375 t_fmt Define the appropriate time representation, correspond‐
1376 ing to the %X field descriptor. The operand consists of
1377 a string, and can contain any combination of characters
1378 and field descriptors. In addition, the string can con‐
1379 tain the escape sequences \\, \a, \b, \f, \n, \r, \t,
1380 \v.
1381
1382
1383 am_pm Define the appropriate representation of the ante meri‐
1384 diem and post meridiem strings, corresponding to the %p
1385 field descriptor. The operand consists of two strings,
1386 separated by a semicolon, each surrounded by double-
1387 quotes. The first string represents the ante meridiem
1388 designation, the last string the post meridiem designa‐
1389 tion.
1390
1391
1392 t_fmt_ampm Define the appropriate time representation in the
1393 12-hour clock format with am_pm, corresponding to the %r
1394 field descriptor. The operand consists of a string and
1395 can contain any combination of characters and field
1396 descriptors. If the string is empty, the 12-hour format
1397 is not supported in the locale.
1398
1399
1400 era Define how years are counted and displayed for each era
1401 in a locale. The operand consists of semicolon-separated
1402 strings. Each string is an era description segment with
1403 the format:
1404
1405 direction:offset:start_date:end_date:era_name:era_format
1406
1407 according to the definitions below. There can be as
1408 many era description segments as are necessary to
1409 describe the different eras.
1410
The start of an era might not be the earliest point in the era; it
can be the latest. For example, the Christian era B.C. starts on the
day before January 1, A.D. 1, and increases with earlier time.
1414
1415 direction Either a + or a - character. The + charac‐
1416 ter indicates that years closer to the
1417 start_date have lower numbers than those
1418 closer to the end_date. The - character
1419 indicates that years closer to the
1420 start_date have higher numbers than those
1421 closer to the end_date.
1422
1423
1424 offset The number of the year closest to the
1425 start_date in the era, corresponding to
1426 the %Eg and %Ey field descriptors.
1427
1428
1429 start_date A date in the form yyyy/mm/dd, where yyyy,
1430 mm, and dd are the year, month and day
1431 numbers respectively of the start of the
1432 era. Years prior to A.D. 1 are represented
1433 as negative numbers.
1434
1435
1436 end_date The ending date of the era, in the same
1437 format as the start_date, or one of the
1438 two special values -* or +*. The value -*
1439 indicates that the ending date is the
1440 beginning of time. The value +* indicates
1441 that the ending date is the end of time.
1442
1443
1444 era_name A string representing the name of the era,
1445 corresponding to the %EC field descriptor.
1446
1447
1448 era_format A string for formatting the year in the
1449 era, corresponding to the %EG and %EY
1450 field descriptors.
1451
1452
1453
1454 era_d_fmt Define the format of the date in alternative era nota‐
1455 tion, corresponding to the %Ex field descriptor.
1456
1457
1458 era_t_fmt Define the locale's appropriate alternative time format,
1459 corresponding to the %EX field descriptor.
1460
1461
1462 era_d_t_fmt Define the locale's appropriate alternative date and
1463 time format, corresponding to the %Ec field descriptor.
1464
1465
1466 alt_digits Define alternative symbols for digits, corresponding to
1467 the %O field descriptor modifier. The operand consists
1468 of semicolon-separated strings, each surrounded by dou‐
1469 ble-quotes. The first string is the alternative symbol
1470 corresponding with zero, the second string the symbol
1471 corresponding with one, and so on. Up to 100 alternative
1472 symbol strings can be specified. The %O modifier indi‐
1473 cates that the string corresponding to the value speci‐
1474 fied via the field descriptor will be used instead of
1475 the value.
1476
1477
1478 LC_TIME C-language Access
1479 The following information can be accessed. These correspond to con‐
1480 stants defined in <langinfo.h> and used as arguments to the nl_lang‐
1481 info(3C) function.
1482
1483 ABDAY_x The abbreviated weekday names (for example Sun), where x
1484 is a number from 1 to 7.
1485
1486
1487 DAY_x The full weekday names (for example Sunday), where x is
1488 a number from 1 to 7.
1489
1490
1491 ABMON_x The abbreviated month names (for example Jan), where x
1492 is a number from 1 to 12.
1493
1494
1495 MON_x The full month names (for example January), where x is a
1496 number from 1 to 12.
1497
1498
1499 D_T_FMT The appropriate date and time representation.
1500
1501
1502 D_FMT The appropriate date representation.
1503
1504
1505 T_FMT The appropriate time representation.
1506
1507
1508 AM_STR The appropriate ante-meridiem affix.
1509
1510
1511 PM_STR The appropriate post-meridiem affix.
1512
1513
1514 T_FMT_AMPM The appropriate time representation in the 12-hour clock
1515 format with AM_STR and PM_STR.
1516
1517
1518 ERA The era description segments, which describe how years
1519 are counted and displayed for each era in a locale. Each
1520 era description segment has the format:
1521
1522 direction:offset:start_date:end_date:era_name:era_format
1523
1524
1525 according to the definitions below. There will be as
1526 many era description segments as are necessary to
1527 describe the different eras. Era description segments
1528 are separated by semicolons.
1529
The start of an era might not be the earliest point in the era; it
can be the latest. For example, the Christian era B.C. starts on the
day before January 1, A.D. 1, and increases with earlier time.
1533
1534 direction Either a + or a - character. The + charac‐
1535 ter indicates that years closer to the
1536 start_date have lower numbers than those
1537 closer to the end_date. The - character
1538 indicates that years closer to the
1539 start_date have higher numbers than those
1540 closer to the end_date.
1541
1542
1543 offset The number of the year closest to the
1544 start_date in the era.
1545
1546
1547 start_date A date in the form yyyy/mm/dd, where yyyy,
1548 mm, and dd are the year, month and day
1549 numbers respectively of the start of the
1550 era. Years prior to AD 1 are represented
1551 as negative numbers.
1552
1553
1554 end_date The ending date of the era, in the same
1555 format as the start_date, or one of the
1556 two special values, -* or +*. The value -*
1557 indicates that the ending date is the
1558 beginning of time. The value +* indicates
1559 that the ending date is the end of time.
1560
1561
1562 era_name The era, corresponding to the %EC conver‐
1563 sion specification.
1564
1565
era_format The format of the year in the era,
corresponding to the %EG and %EY conversion
specifications.
1569
1570
1571
1572 ERA_D_FMT The era date format.
1573
1574
1575 ERA_T_FMT The locale's appropriate alternative time format, corre‐
1576 sponding to the %EX field descriptor.
1577
1578
1579 ERA_D_T_FMT The locale's appropriate alternative date and time for‐
1580 mat, corresponding to the %Ec field descriptor.
1581
1582
ALT_DIGITS The alternative symbols for digits, corresponding to the
%O conversion specification modifier. The value consists
of semicolon-separated symbols. The first is the alternative
symbol corresponding to zero, the second is the
symbol corresponding to one, and so on. Up to 100
alternative symbols may be specified.

The following table displays the correspondence between the items
described above and the conversion specifiers used by date(1) and the
strftime(3C), wcsftime(3C), and strptime(3C) functions.
1593
1594
1595
1596
1597
┌────────────────────┬────────────────────┬────────────────────┐
│ localedef          │ langinfo           │ Conversion         │
│ Keyword            │ Constant           │ Specifier          │
├────────────────────┼────────────────────┼────────────────────┤
│ abday              │ ABDAY_x            │ %a                 │
│ day                │ DAY_x              │ %A                 │
│ abmon              │ ABMON_x            │ %b                 │
│ mon                │ MON_x              │ %B                 │
│ d_t_fmt            │ D_T_FMT            │ %c                 │
│ date_fmt           │ DATE_FMT           │ %C                 │
│ d_fmt              │ D_FMT              │ %x                 │
│ t_fmt              │ T_FMT              │ %X                 │
│ am_pm              │ AM_STR             │ %p                 │
│ am_pm              │ PM_STR             │ %p                 │
│ t_fmt_ampm         │ T_FMT_AMPM         │ %r                 │
│ era                │ ERA                │ %EC, %Eg,          │
│                    │                    │ %EG, %Ey, %EY      │
│ era_d_fmt          │ ERA_D_FMT          │ %Ex                │
│ era_t_fmt          │ ERA_T_FMT          │ %EX                │
│ era_d_t_fmt        │ ERA_D_T_FMT        │ %Ec                │
│ alt_digits         │ ALT_DIGITS         │ %O                 │
└────────────────────┴────────────────────┴────────────────────┘
1620
1621 LC_TIME General Information
1622 Although certain of the field descriptors in the POSIX locale (such as
1623 the name of the month) are shown with initial capital letters, this
1624 need not be the case in other locales. Programs using these fields may
1625 need to adjust the capitalization if the output is going to be used at
1626 the beginning of a sentence.
1627
1628
1629 The LC_TIME descriptions of abday, day, mon, and abmon imply a Grego‐
1630 rian style calendar (7-day weeks, 12-month years, leap years, and so
1631 forth). Formatting time strings for other types of calendars is outside
1632 the scope of this document set.
1633
1634
1635 As specified under date in Locale Definition and strftime(3C), the
1636 field descriptors corresponding to the optional keywords consist of a
1637 modifier followed by a traditional field descriptor (for instance %Ex).
1638 If the optional keywords are not supported by the implementation or are
1639 unspecified for the current locale, these field descriptors are treated
1640 as the traditional field descriptor. For instance, assume the following
1641 keywords:
1642
alt_digits  "0th" ; "1st" ; "2nd" ; "3rd" ; "4th" ; "5th" ; \
            "6th" ; "7th" ; "8th" ; "9th" ; "10th"
d_fmt       "The %Od day of %B in %Y"
1646
1647
1648
1649
On 7/4/1776, the %x field descriptor would result in "The 4th day of
July in 1776", while 7/14/1789 would come out as "The 14 day of July in
1789" because alt_digits defines no alternative symbol for 14. The
above example is for illustrative purposes only. The %O modifier is
primarily intended to provide for Kanji or Hindi digits in date
formats.
1655
1656 LC_MESSAGES
1657 The LC_MESSAGES category defines the format and values for affirmative
1658 and negative responses.
1659
1660
1661 The following keywords are recognized as part of the locale definition
1662 file. The nl_langinfo(3C) function accepts upper-case versions of the
1663 first four keywords.
1664
1665 yesexpr The operand consists of an extended regular expression (see
1666 regex(5)) that describes the acceptable affirmative response
1667 to a question expecting an affirmative or negative response.
1668
1669
1670 noexpr The operand consists of an extended regular expression that
1671 describes the acceptable negative response to a question
1672 expecting an affirmative or negative response.
1673
1674
1675 yesstr The operand consists of a fixed string (not a regular
1676 expression) that can be used by an application for composi‐
1677 tion of a message that lists an acceptable affirmative
1678 response, such as in a prompt.
1679
1680
nostr The operand consists of a fixed string that can be used by
an application for composition of a message that lists an
acceptable negative response.

The format and values for affirmative and negative responses of the
POSIX locale follow. The code listing depicts the localedef input; the
table represents the same information with the addition of
nl_langinfo() constants.
1688
1689 LC_MESSAGES
1690 # This is the POSIX locale definition for
1691 # the LC_MESSAGES category.
1692 #
1693 yesexpr "<circumflex><left-square-bracket><y><Y>\
1694 <right-square-bracket>"
1695 #
1696 noexpr "<circumflex><left-square-bracket><n><N>\
1697 <right-square-bracket>"
1698 #
1699 yesstr "yes"
1700 nostr "no"
1701 END LC_MESSAGES
1702
1703
1704
1705
1706
1707
1708
┌────────────────────┬────────────────────┬────────────────────┐
│ localedef Keyword  │ langinfo Constant  │ POSIX Locale Value │
├────────────────────┼────────────────────┼────────────────────┤
│ yesexpr            │ YESEXPR            │ "^[yY]"            │
│ noexpr             │ NOEXPR             │ "^[nN]"            │
│ yesstr             │ YESSTR             │ "yes"              │
│ nostr              │ NOSTR              │ "no"               │
└────────────────────┴────────────────────┴────────────────────┘
1716
1717
1718 In an application conforming to the SUSv3 standard, the information on
1719 yesstr and nostr is not available.
1720
SEE ALSO
1722 date(1), locale(1), localedef(1), sort(1), tr(1), uniq(1), locale‐
1723 conv(3C), nl_langinfo(3C), setlocale(3C), strcoll(3C), strftime(3C),
1724 strptime(3C), strxfrm(3C), wcscoll(3C), wcsftime(3C), wcsxfrm(3C),
1725 wctype(3C), attributes(5), charmap(5), extensions(5), regex(5)
1726
1727
1728
1729SunOS 5.11 1 Dec 2003 locale(5)