Encode::TW(3pm)        Perl Programmers Reference Guide        Encode::TW(3pm)

NAME

       Encode::TW - Taiwan-based Chinese Encodings

SYNOPSIS

           use Encode qw/encode decode/;
           $big5 = encode("big5", $utf8); # loads Encode::TW implicitly
           $utf8 = decode("big5", $big5); # ditto
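
       As a slightly fuller version of the synopsis, the following sketch
       runs a complete round trip on a two-character string. The variable
       names are illustrative only, and the characters are written as code
       point escapes so that the script does not depend on the encoding of
       its own source file.

```perl
use strict;
use warnings;
use Encode qw/encode decode/;

# U+4E2D U+6587: two common traditional Chinese characters.
my $text  = "\x{4e2d}\x{6587}";

my $bytes = encode("big5", $text);    # character string -> Big5 octets
my $back  = decode("big5", $bytes);   # Big5 octets -> character string

printf "%d characters became %d bytes\n", length($text), length($bytes);
print "round trip preserved the text\n" if $back eq $text;
```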

DESCRIPTION

       This module implements traditional Chinese charset encodings as used in
       Taiwan and Hong Kong.  Encodings supported are as follows.

         Canonical   Alias             Description
         --------------------------------------------------------------------
         big5-eten   /\bbig-?5$/i      Big5 encoding (with ETen extensions)
                     /\bbig5-?et(en)?$/i
                     /\btca-?big5$/i
         big5-hkscs  /\bbig5-?hk(scs)?$/i
                     /\bhk(scs)?-?big5$/i
                                       Big5 + Cantonese characters in Hong Kong
         MacChineseTrad                Big5 + Apple Vendor Mappings
         cp950                         Code Page 950
                                       = Big5 + Microsoft vendor mappings
         --------------------------------------------------------------------

       To find out how to use this module in detail, see Encode.
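
       To see which canonical encoding a given name resolves to, one
       option is to ask Encode directly. The sketch below uses
       find_encoding() from Encode and the name() method of the object it
       returns. The list of names is only a sample chosen to exercise the
       alias patterns in the table, and the %canonical hash is an
       illustrative variable, not part of this module.

```perl
use strict;
use warnings;
use Encode qw/find_encoding/;

# Sample names only; any string can be tried here.
my @names = qw/big5 Big-5 big5-et tca-big5 big5-hk hkscs-big5 cp950 MacChineseTrad/;

my %canonical;
for my $name (@names) {
    my $enc = find_encoding($name);          # undef if the name is unknown
    $canonical{$name} = defined $enc ? $enc->name : undef;
    printf "%-15s => %s\n", $name, $canonical{$name} // "(not found)";
}
```

       Resolving a name this way before processing data makes it explicit
       which of the Big5 variants will actually be used.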

NOTES

       Due to size concerns, "EUC-TW" (Extended Unix Character), "CCCII"
       (Chinese Character Code for Information Interchange), "BIG5PLUS"
       (CMEX's Big5+) and "BIG5EXT" (CMEX's Big5e) are distributed separately
       on CPAN, under the name Encode::HanExtra. That module also contains
       extra China-based encodings.
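
       Because Encode::HanExtra is a separate distribution, code that
       wants "EUC-TW" may need to cope with it being absent. The sketch
       below is one way to do that, assuming that the encoding is
       registered under the name "euc-tw" once Encode::HanExtra is
       loaded. The variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw/find_encoding/;

# Encode::HanExtra is optional, so load it inside eval and record the result.
my $have_hanextra = eval { require Encode::HanExtra; 1 } ? 1 : 0;

# find_encoding() returns undef when no loaded module provides the name.
my $euc_tw = find_encoding("euc-tw");

if (defined $euc_tw) {
    print "EUC-TW is available as ", $euc_tw->name, "\n";
}
else {
    print "EUC-TW is not available; Encode::HanExtra may need to be installed\n";
}
```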

BUGS

       Since the original "big5" encoding (1984) is not supported anywhere
       (glibc and DOS-based systems use "big5" to mean "big5-eten"; Microsoft
       uses "big5" to mean "cp950"), a conscious decision was made to alias
       "big5" to "big5-eten", which is the de facto superset of the original
       big5.

       The "CNS11643" encoding files are not complete. For common "CNS11643"
       manipulation, please use "EUC-TW" in Encode::HanExtra, which contains
       planes 1-7.

       The ASCII region (0x00-0x7f) is preserved for all encodings, even
       though this conflicts with mappings by the Unicode Consortium.  See

       <http://www.debian.or.jp/~kubota/unicode-symbols.html.en>

       to find out why it is implemented that way.
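
       The ASCII behaviour described above can be checked with a short
       script. The helper ascii_mismatches() below is hypothetical and not
       part of Encode; it decodes each single byte in a range and reports
       those that do not map to the code point with the same value.

```perl
use strict;
use warnings;
use Encode qw/decode/;

# Hypothetical helper: return the byte values in [$lo, $hi] that do NOT
# decode to the code point of the same value under $encoding.
sub ascii_mismatches {
    my ($encoding, $lo, $hi) = @_;
    my @bad;
    for my $byte ($lo .. $hi) {
        my $char = decode($encoding, chr($byte));
        push @bad, $byte unless length($char) == 1 && ord($char) == $byte;
    }
    return @bad;
}

for my $encoding (qw/big5-eten big5-hkscs cp950/) {
    my @bad = ascii_mismatches($encoding, 0x00, 0x7f);
    printf "%-11s %s\n", $encoding,
        @bad ? "differs at bytes: @bad" : "ASCII region preserved";
}
```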

SEE ALSO

       Encode



perl v5.10.1                      2009-02-12                   Encode::TW(3pm)