Util(3)               User Contributed Perl Documentation              Util(3)

NAME

       Lingua::KO::Hangul::Util - utility functions for Hangul in Unicode

SYNOPSIS

         use Lingua::KO::Hangul::Util qw(:all);

         decomposeSyllable("\x{AC00}");          # "\x{1100}\x{1161}"
         composeSyllable("\x{1100}\x{1161}");    # "\x{AC00}"
         decomposeJamo("\x{1101}");              # "\x{1100}\x{1100}"
         composeJamo("\x{1100}\x{1100}");        # "\x{1101}"

         getHangulName(0xAC00);                  # "HANGUL SYLLABLE GA"
         parseHangulName("HANGUL SYLLABLE GA");  # 0xAC00

DESCRIPTION

       A Hangul syllable consists of Hangul jamo (Hangul letters).

       Hangul letters are classified into three classes:

         CHOSEONG  (the initial sound) as a leading consonant (L),
         JUNGSEONG (the medial sound)  as a vowel (V),
         JONGSEONG (the final sound)   as a trailing consonant (T).

       Any Hangul syllable is a composition of (i) L + V, or (ii) L + V + T.

   Composition and Decomposition
       "$resultant_string = decomposeSyllable($string)"
           It decomposes a precomposed syllable ("LV" or "LVT") to a sequence
           of conjoining jamo ("L + V" or "L + V + T") and returns the result
           as a string.

           Any characters other than Hangul syllables are not affected.
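
           The arithmetic behind this decomposition comes from the Unicode
           Standard. The following is a minimal pure-Perl sketch of it, not
           this module's own code; the sub name decompose_syllable_sketch
           and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Constants from the Unicode Standard's Hangul syllable arithmetic.
my ($SBase, $LBase, $VBase, $TBase) = (0xAC00, 0x1100, 0x1161, 0x11A7);
my ($VCount, $TCount) = (21, 28);

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub decompose_syllable_sketch {
    my ($str) = @_;
    # Only precomposed syllables (U+AC00..U+D7A3) are rewritten.
    $str =~ s{([\x{AC00}-\x{D7A3}])}{
        my $s = ord($1) - $SBase;
        my $l = $LBase + int($s / ($VCount * $TCount));
        my $v = $VBase + int(($s % ($VCount * $TCount)) / $TCount);
        my $t = $TBase + $s % $TCount;
        # An index of 0 for T means "no trailing consonant".
        join('', map { chr($_) } $l, $v, ($t == $TBase ? () : $t));
    }ge;
    return $str;
}

# decompose_syllable_sketch("\x{AC00}")  gives "\x{1100}\x{1161}"
# decompose_syllable_sketch("\x{AE00}.") gives "\x{1100}\x{1173}\x{11AF}."
```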

       "$resultant_string = composeSyllable($string)"
           It composes a sequence of conjoining jamo ("L + V" or "L + V + T")
           to a precomposed syllable ("LV" or "LVT") if possible, and returns
           the result as a string.  A syllable "LV" and final jamo "T" are
           also composed.

           Any characters other than Hangul jamo and syllables are not
           affected.
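
           The two composition steps can be sketched in pure Perl as below.
           This is an illustration based on the Unicode Standard's syllable
           arithmetic, not this module's implementation; the sub name
           compose_syllable_sketch is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub compose_syllable_sketch {
    my ($str) = @_;
    # Step 1: L + V -> LV. Only the 19 modern L (U+1100..U+1112) and
    # the 21 modern V (U+1161..U+1175) take part in composition.
    $str =~ s{([\x{1100}-\x{1112}])([\x{1161}-\x{1175}])}{
        chr(0xAC00 + ((ord($1) - 0x1100) * 21 + (ord($2) - 0x1161)) * 28)
    }ge;
    # Step 2: LV + T -> LVT. An LV syllable lies at a multiple of 28
    # from U+AC00; an LVT syllable followed by T is left alone.
    $str =~ s{([\x{AC00}-\x{D7A3}])([\x{11A8}-\x{11C2}])}{
        my ($s, $t) = (ord($1), ord($2));
        ($s - 0xAC00) % 28 == 0 ? chr($s + $t - 0x11A7) : "$1$2";
    }ge;
    return $str;
}

# compose_syllable_sketch("\x{1100}\x{1173}\x{11AF}.") gives "\x{AE00}."
# compose_syllable_sketch("\x{AC00}\x{11A8}")          gives "\x{AC01}"
```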

       "$resultant_string = decomposeJamo($string)"
           It decomposes a complex jamo to a sequence of simple jamo if
           possible, and returns the result as a string.  Any characters other
           than complex jamo are not affected.

             e.g.
                 CHOSEONG SIOS-PIEUP to CHOSEONG SIOS + PIEUP
                 JUNGSEONG AE        to JUNGSEONG A + I
                 JUNGSEONG WE        to JUNGSEONG U + EO + I
                 JONGSEONG SSANGSIOS to JONGSEONG SIOS + SIOS
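
           A table lookup is one way to picture this mapping. The sketch
           below uses a deliberately partial table with four entries chosen
           for illustration; the table contents, %JAMO_PARTS and
           decompose_jamo_sketch are assumptions, not the module's data or
           code.

```perl
use strict;
use warnings;

# Partial, illustrative table: complex jamo => sequence of simple jamo.
my %JAMO_PARTS = (
    "\x{1101}" => "\x{1100}\x{1100}",          # CHOSEONG SSANGKIYEOK
    "\x{1162}" => "\x{1161}\x{1175}",          # JUNGSEONG AE -> A + I
    "\x{1170}" => "\x{116E}\x{1165}\x{1175}",  # JUNGSEONG WE -> U + EO + I
    "\x{11BB}" => "\x{11BA}\x{11BA}",          # JONGSEONG SSANGSIOS
);

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub decompose_jamo_sketch {
    my ($str) = @_;
    # Replace each character found in the table; keep all others as-is.
    $str =~ s/(.)/exists $JAMO_PARTS{$1} ? $JAMO_PARTS{$1} : $1/gse;
    return $str;
}

# decompose_jamo_sketch("\x{1170}") gives "\x{116E}\x{1165}\x{1175}"
```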

       "$resultant_string = composeJamo($string)"
           It composes a sequence of simple jamo ("L1 + L2", "V1 + V2 + V3",
           etc.) to a complex jamo if possible, and returns the result as a
           string.  Any characters other than simple jamo are not affected.

             e.g.
                 CHOSEONG SIOS + PIEUP to CHOSEONG SIOS-PIEUP
                 JUNGSEONG A + I       to JUNGSEONG AE
                 JUNGSEONG U + EO + I  to JUNGSEONG WE
                 JONGSEONG SIOS + SIOS to JONGSEONG SSANGSIOS
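
           When composing, a three-jamo sequence such as U + EO + I has to
           be tried before its two-jamo prefix. The sketch below shows one
           way to do that with a longest-first alternation. The table is
           partial and illustrative, and both %JAMO_WHOLE and
           compose_jamo_sketch are hypothetical names; the module's actual
           table and matching strategy may differ.

```perl
use strict;
use warnings;

# Partial, illustrative table: simple-jamo sequence => complex jamo.
my %JAMO_WHOLE = (
    "\x{1100}\x{1100}"         => "\x{1101}",  # CHOSEONG SSANGKIYEOK
    "\x{1161}\x{1175}"         => "\x{1162}",  # JUNGSEONG A + I -> AE
    "\x{116E}\x{1165}"         => "\x{116F}",  # JUNGSEONG U + EO -> WEO
    "\x{116E}\x{1165}\x{1175}" => "\x{1170}",  # JUNGSEONG U + EO + I -> WE
    "\x{11BA}\x{11BA}"         => "\x{11BB}",  # JONGSEONG SSANGSIOS
);

# Longest sequences first, so that U + EO + I wins over U + EO.
my $alternation = join '|',
    sort { length($b) <=> length($a) } keys %JAMO_WHOLE;

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub compose_jamo_sketch {
    my ($str) = @_;
    $str =~ s/($alternation)/$JAMO_WHOLE{$1}/g;
    return $str;
}

# compose_jamo_sketch("\x{116E}\x{1165}\x{1175}") gives "\x{1170}"
# compose_jamo_sketch("\x{116E}\x{1165}")         gives "\x{116F}"
```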

       "$resultant_string = decomposeFull($string)"
           It decomposes a syllable/complex jamo to a sequence of simple jamo.
           Equivalent to "decomposeJamo(decomposeSyllable($string))".

   Composition and Decomposition (Old-interface, deprecated!)
       "$string_decomposed = decomposeHangul($code_point)"
       "@codepoints = decomposeHangul($code_point)"
           If the specified code point is a Hangul syllable, it returns its
           decomposition as a list of code points (in list context) or as a
           string (in scalar context).

              decomposeHangul(0xAC00);  # U+AC00 is HANGUL SYLLABLE GA.
              # returns "\x{1100}\x{1161}" or (0x1100, 0x1161)

              decomposeHangul(0xAE00);  # U+AE00 is HANGUL SYLLABLE GEUL.
              # returns "\x{1100}\x{1173}\x{11AF}" or (0x1100, 0x1173, 0x11AF)

           Otherwise, it returns false (an empty string or an empty list).

              decomposeHangul(0x0041);  # outside Hangul syllables
              # returns an empty string or an empty list

       "$string_composed = composeHangul($src_string)"
       "@code_points_composed = composeHangul($src_string)"
           Any sequence of an initial jamo "L" and a medial jamo "V" is
           composed to a syllable "LV"; then any sequence of a syllable "LV"
           and a final jamo "T" is composed to a syllable "LVT".  The result
           is returned as a string (in scalar context) or as a list of code
           points (in list context).

           Any characters other than Hangul jamo and syllables are not
           affected.

              composeHangul("\x{1100}\x{1173}\x{11AF}.");
              # returns "\x{AE00}." or (0xAE00, 0x2E)

       "$code_point_composite = getHangulComposite($code_point_here,
       $code_point_next)"
           It returns the code point of the composite if both code points,
           $code_point_here and $code_point_next, are Hangul characters and
           are composable.

           Otherwise, it returns "undef".
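
           The pairwise rule can be sketched as follows, assuming that
           "composable" means the two cases L + V and LV + T from the
           Unicode Standard's syllable arithmetic. The sub name
           hangul_composite_sketch is hypothetical and this is not the
           module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub hangul_composite_sketch {
    my ($here, $next) = @_;

    # Case 1: L + V -> LV
    if (   $here >= 0x1100 && $here <= 0x1112
        && $next >= 0x1161 && $next <= 0x1175) {
        return 0xAC00 + (($here - 0x1100) * 21 + ($next - 0x1161)) * 28;
    }

    # Case 2: LV + T -> LVT (LV syllables sit at multiples of 28)
    if (   $here >= 0xAC00 && $here <= 0xD7A3
        && ($here - 0xAC00) % 28 == 0
        && $next >= 0x11A8 && $next <= 0x11C2) {
        return $here + ($next - 0x11A7);
    }

    # Explicit undef, matching the documented return value.
    return undef;
}

# hangul_composite_sketch(0x1100, 0x1161) gives 0xAC00
# hangul_composite_sketch(0xAC00, 0x11A8) gives 0xAC01
# hangul_composite_sketch(0x0041, 0x1161) gives undef
```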

   Hangul Syllable Name
       The following functions handle only a precomposed Hangul syllable (from
       "U+AC00" to "U+D7A3"), but not a Hangul jamo or other Hangul-related
       character.

       Names of Hangul syllables have a format of "HANGUL SYLLABLE %s".

       "$name = getHangulName($code_point)"
           If the specified code point is a Hangul syllable, it returns its
           name; otherwise it returns undef.

              getHangulName(0xAC00);  # returns "HANGUL SYLLABLE GA"
              getHangulName(0x0041);  # returns undef

       "$codepoint = parseHangulName($name)"
           If the specified name is that of a Hangul syllable, it returns its
           code point; otherwise it returns undef.

              parseHangulName("HANGUL SYLLABLE GEUL");  # returns 0xAE00

              parseHangulName("LATIN SMALL LETTER A");  # returns undef

              parseHangulName("HANGUL SYLLABLE PERL");  # returns undef
               # Regrettably, HANGUL SYLLABLE PERL does not exist :-)
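
           The "%s" part of a syllable name is built from the jamo short
           names defined by the Unicode Standard. The sketch below derives
           names that way and builds a reverse lookup for parsing; the sub
           names hangul_name_sketch and parse_hangul_name_sketch are
           hypothetical, and this is not necessarily how the module itself
           does it.

```perl
use strict;
use warnings;

# Jamo short names from the Unicode Standard, in code point order.
my @L = (qw(G GG N D DD R M B BB S SS), '', qw(J JJ C K T P H));
my @V = qw(A AE YA YAE EO E YEO YE O WA WAE OE YO U WEO WE WI YU EU YI I);
my @T = ('', qw(G GG GS N NJ NH D L LG LM LB LS LT LP LH
                M B BS S SS NG J C K T P H));

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub hangul_name_sketch {
    my ($cp) = @_;
    return undef unless defined $cp && $cp >= 0xAC00 && $cp <= 0xD7A3;
    my $s = $cp - 0xAC00;
    # 588 = 21 vowels * 28 trailing slots per leading consonant.
    return 'HANGUL SYLLABLE '
         . $L[ int($s / 588) ] . $V[ int(($s % 588) / 28) ] . $T[ $s % 28 ];
}

# Reverse lookup over all 11172 syllables.
my %CP_BY_NAME = map { (hangul_name_sketch($_), $_) } 0xAC00 .. 0xD7A3;

# Hypothetical helper; yields undef for names that are not Hangul syllables.
sub parse_hangul_name_sketch {
    my ($name) = @_;
    return $CP_BY_NAME{$name};
}

# hangul_name_sketch(0xAE00) gives "HANGUL SYLLABLE GEUL"
# parse_hangul_name_sketch("HANGUL SYLLABLE GA") gives 0xAC00
```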

   Standard Korean Syllable Block
       A standard Korean syllable block consists of "L+ V+ T*" (a sequence of
       one or more L, one or more V, and zero or more T) according to the
       conjoining jamo behavior revised in Unicode 3.2 (cf. UAX #28).  A
       sequence of "L" followed by "T" is not a syllable block because it
       lacks "V"; it consists of two nonstandard syllable blocks: one without
       "V", and another without "L" and "V".

       "$bool = isStandardForm($string)"
           It returns a boolean indicating whether the string is encoded in
           standard form: true only if the string contains no nonstandard
           sequence.
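
           For a string made only of conjoining jamo, the "L+ V+ T*" rule
           can be checked with a regular expression. The sketch below is a
           simplification under that assumption: it ignores precomposed
           syllables, it uses the jamo ranges listed under CAVEAT, and
           is_standard_form_sketch is a hypothetical name, not the module's
           implementation.

```perl
use strict;
use warnings;

# Jamo classes, using the ranges this module documents as supported.
my $L = qr/[\x{1100}-\x{1159}\x{115F}]/;  # leading consonants and Lf
my $V = qr/[\x{1160}-\x{11A2}]/;          # vowels, including Vf
my $T = qr/[\x{11A8}-\x{11F9}]/;          # trailing consonants

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub is_standard_form_sketch {
    my ($str) = @_;
    # Remove every standard block "L+ V+ T*" from a copy of the string.
    (my $rest = $str) =~ s/$L+$V+$T*//g;
    # Any jamo left over belongs to a nonstandard block.
    return $rest =~ /$L|$V|$T/ ? 0 : 1;
}

# is_standard_form_sketch("\x{1100}\x{1161}\x{11A8}") gives 1
# is_standard_form_sketch("\x{1100}\x{11A8}")         gives 0
```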

       "$resultant_string = insertFiller($string)"
           It transforms the string into standard form by inserting fillers
           into each nonstandard syllable block and returns the result as a
           string.  Choseong filler ("Lf", "U+115F") is inserted into a
           syllable block without "L". Jungseong filler ("Vf", "U+1160") is
           inserted into a syllable block without "V".
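
           The filler rules can be pictured block by block. The sketch
           below assumes the input holds only conjoining jamo (no
           precomposed syllables) and treats a run of "T" with no preceding
           "V" as a block lacking both "L" and "V". The name
           insert_filler_sketch is hypothetical, and the module may place
           fillers differently for some inputs.

```perl
use strict;
use warnings;

# Jamo classes, using the ranges this module documents as supported.
my $L = qr/[\x{1100}-\x{1159}\x{115F}]/;  # leading consonants and Lf
my $V = qr/[\x{1160}-\x{11A2}]/;          # vowels, including Vf
my $T = qr/[\x{11A8}-\x{11F9}]/;          # trailing consonants

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
# Block with L:  keep it, appending Vf when no V follows.
# Block "V+ T*": prepend Lf.
# Block "T+":    prepend Lf and Vf.
sub insert_filler_sketch {
    my ($str) = @_;
    $str =~ s{ ($L+) ($V+ $T*)? | ($V+ $T*) | ($T+) }{
          defined $1 ? $1 . (defined $2 ? $2 : "\x{1160}")
        : defined $3 ? "\x{115F}" . $3
        :              "\x{115F}\x{1160}" . $4
    }gex;
    return $str;
}

# insert_filler_sketch("\x{1161}") gives "\x{115F}\x{1161}"
# insert_filler_sketch("\x{1100}") gives "\x{1100}\x{1160}"
```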

       "$type = getSyllableType($code_point)"
           It returns the Hangul syllable type (cf. HangulSyllableType.txt)
           for the specified code point as a string: "L" for leading jamo, "V"
           for vowel jamo, "T" for trailing jamo, "LV" for LV syllables, "LVT"
           for LVT syllables, and "NA" for other code points (as Not
           Applicable).
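
           The classification reduces to a few range checks. The sketch
           below uses the code point ranges listed under CAVEAT and the
           fact that LV syllables sit at multiples of 28 from U+AC00; the
           name syllable_type_sketch is hypothetical and the code is an
           illustration, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::KO::Hangul::Util.
sub syllable_type_sketch {
    my ($cp) = @_;
    return 'L' if ($cp >= 0x1100 && $cp <= 0x1159) || $cp == 0x115F;
    return 'V' if $cp >= 0x1160 && $cp <= 0x11A2;
    return 'T' if $cp >= 0x11A8 && $cp <= 0x11F9;
    if ($cp >= 0xAC00 && $cp <= 0xD7A3) {
        # A zero trailing index means the syllable has no final jamo.
        my $t_index = ($cp - 0xAC00) % 28;
        return $t_index == 0 ? 'LV' : 'LVT';
    }
    return 'NA';
}

# syllable_type_sketch(0xAC00) gives "LV"
# syllable_type_sketch(0xAE00) gives "LVT"
# syllable_type_sketch(0x0041) gives "NA"
```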

EXPORT

       By default:

           decomposeHangul
           composeHangul
           getHangulName
           parseHangulName
           getHangulComposite

       On request:

           decomposeSyllable
           composeSyllable
           decomposeJamo
           composeJamo
           decomposeFull
           isStandardForm
           insertFiller
           getSyllableType

CAVEAT

       This module does not support Hangul jamo assigned in Unicode 5.2.0
       (2009).

       A list of Hangul characters this module supports:

           1100..1159 ; 1.1 # [90] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG YEORINHIEUH
           115F..11A2 ; 1.1 # [68] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG SSANGARAEA
           11A8..11F9 ; 1.1 # [82] HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORINHIEUH
           AC00..D7A3 ; 2.0 # [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH

AUTHOR

       SADAHIRO Tomoyuki <SADAHIRO@cpan.org>

       Copyright(C) 2001, 2003, 2005, SADAHIRO Tomoyuki. Japan.  All rights
       reserved.

       This module is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO

       Unicode Normalization Forms (UAX #15)
           <http://www.unicode.org/reports/tr15/>

       Conjoining Jamo Behavior (revision) in UAX #28
           <http://www.unicode.org/reports/tr28/#3_11_conjoining_jamo_behavior>

       Hangul Syllable Type
           <http://www.unicode.org/Public/UNIDATA/HangulSyllableType.txt>

       Jamo Decomposition in Old Unicode
           <http://www.unicode.org/Public/2.1-Update3/UnicodeData-2.1.8.txt>

       ISO/IEC JTC1/SC22/WG20 N954
           Paper by K. KIM: New canonical decomposition and composition
           processes for Hangeul

           <http://std.dkuug.dk/JTC1/SC22/WG20/docs/N954.PDF>

           (summary: <http://std.dkuug.dk/JTC1/SC22/WG20/docs/N953.PDF>) (cf.
           <http://std.dkuug.dk/JTC1/SC22/WG20/docs/documents.html>)

perl v5.36.0                      2023-01-20                           Util(3)