NKF(3)                User Contributed Perl Documentation               NKF(3)

NAME

       NKF - Perl extension for Network Kanji Filter

SYNOPSIS

         use NKF;
         $output = nkf("-s",$input);

DESCRIPTION

       This is a Perl extension version of nkf (Network Kanji Filter).  It
       converts the last argument and returns the converted result.
       Conversion details are specified by flags given before the last
       argument.

       nkf is yet another kanji code converter for use among networks, hosts
       and terminals.  It converts the input kanji code to a designated kanji
       code such as ISO-2022-JP, Shift_JIS, EUC-JP, UTF-8 or UTF-16.

       One of nkf's most distinctive features is that it guesses the input
       kanji encoding.  It currently recognizes ISO-2022-JP, Shift_JIS,
       EUC-JP, UTF-8 and UTF-16, so users need not specify the input kanji
       code explicitly.

       By default, X0201 kana is converted into X0208 kana.  For X0201 kana,
       the SO/SI, SSO and ESC-(-I methods are supported.  For automatic code
       detection, nkf assumes that Shift_JIS input contains no X0201 kana.
       To accept X0201 kana in Shift_JIS, use -X, -x or -S.
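
       The calling convention described above (flags first, text to convert
       last) can be sketched as follows.  The helper name convert_text is
       hypothetical and not part of NKF, and the module is loaded at run
       time only so that a missing installation does not stop the script at
       compile time.  Treat this as an illustration, not as the recommended
       way to call the module.

```perl
use strict;
use warnings;

# Load NKF at run time; $have_nkf is false if the module is not installed.
my $have_nkf = eval { require NKF; NKF->import(); 1 };

# Hypothetical helper (not part of NKF): the flags come first and the
# text to convert is the last argument passed to nkf().
sub convert_text {
    my ($input, @flags) = @_;
    die "NKF module is not installed\n" unless $have_nkf;
    return nkf(@flags, $input);
}

if ($have_nkf) {
    my $ascii = "plain ASCII text\n";
    # -w selects UTF-8 output; the input encoding is guessed by nkf.
    print convert_text($ascii, "-w");
}
```

       Because the input encoding is guessed, only an output flag (-w here)
       is given.  An input flag such as -S could be passed in the same way,
       before the text.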

OPTIONS

       -b -u
           Output is buffered (DEFAULT) or unbuffered, respectively.

       -j -s -e -w -w16
           Output code is ISO-2022-JP (7bit JIS), Shift_JIS, EUC-JP, UTF-8N
           or UTF-16BE, respectively.  If none of these options is given and
           no other default was chosen at compile time, ISO-2022-JP is
           assumed.

       -J -S -E -W -W16
           Input is assumed to be JIS 7 bit, Shift_JIS, EUC-JP, UTF-8 or
           UTF-16LE, respectively.

           -J  Assume JIS input.  It also accepts EUC-JP.  This is the
               default.  This flag does not exclude Shift_JIS.

           -S  Assume Shift_JIS and X0201 kana input.  It also accepts JIS.
               EUC-JP is recognized as X0201 kana.  Without the -x flag,
               X0201 kana (halfwidth kana) is converted into X0208.

           -E  Assume EUC-JP input.  It also accepts JIS.  Same as -J.

       -t  No conversion.

       -i[@B]
           Specify the escape sequence for JIS X 0208-1978/83.  (DEFAULT B)

       -o[BJH]
           Specify the escape sequence for ASCII/Roman.  (DEFAULT B)

       -r  Encrypt or decrypt with ROT13/47.

       -h[123] --hiragana --katakana --katakana-hiragana
           -h1 --hiragana
               Katakana to Hiragana conversion.

           -h2 --katakana
               Hiragana to Katakana conversion.

           -h3 --katakana-hiragana
               Katakana to Hiragana and Hiragana to Katakana conversion.

       -T  Text mode output (MS-DOS).

       -l  ISO8859-1 (Latin-1) support.

       -f[m [- n]]
           Fold lines at length m with margin n.  If m and n are omitted,
           the fold length is 60 and the fold margin is 10.

       -F  Line folding that preserves existing newlines.

       -Z[0-3]
           Convert X0208 alphabet (fullwidth alphabet) to ASCII.

           -Z -Z0
               Convert X0208 alphabet to ASCII.

           -Z1 Convert X0208 kankaku (fullwidth space) to a single ASCII
               space.

           -Z2 Convert X0208 kankaku to two ASCII spaces.

           -Z3 Replace fullwidth >, <, ", & with '&gt;', '&lt;', '&quot;',
               '&amp;' as in HTML.

       -X -x
           Assume X0201 kana in MS-Kanji.  With -X, or without either option,
           X0201 kana is converted into X0208 kana.  With -x, try to preserve
           X0208 kana and do not convert X0201 kana to X0208.  In JIS output,
           ESC-(-I is used.  In EUC output, SSO is used.

       -B[0-2]
           Assume broken JIS-Kanji input that has lost its ESC characters.
           Useful when your site is using the old B-News Nihongo patch.

           -B1 allows any character after ESC-( or ESC-$.

           -B2 forces ASCII after NL.

       -I  Replace non-ISO-2022-JP characters with a geta character (the
           substitute character in Japanese).

       -m[BQN0]
           MIME ISO-2022-JP/ISO8859-1 decode.  (DEFAULT)  To see ISO8859-1
           (Latin-1), -l is necessary.

           -mB Decode a MIME base64 encoded stream.  Remove the header or
               other parts before conversion.

           -mQ Decode a MIME quoted stream.  '_' in the quoted stream is
               converted to a space.

           -mN Non-strict decoding.  It allows line breaks in the middle of
               the base64 encoding.

           -m0 No MIME decode.

       -M  MIME encode, header style.  All ASCII code and control characters
           are left intact.

           -MB MIME encode as a Base64 stream.  Kanji conversion is performed
               before encoding, so this cannot be used as a picture encoder.

           -MQ Perform quoted encoding.

       -l  Input and output code is ISO8859-1 (Latin-1) and ISO-2022-JP.  -s,
           -e and -x are not compatible with this option.

       -L[uwm] -d -c
           Convert line breaks.  Without this option, nkf doesn't convert
           line breaks.

           -Lu -d
               unix (LF)

           -Lw -c
               windows (CRLF)

           -Lm mac (CR)

       --fj --unix --mac --msdos --windows
           Convert for these systems.

       --jis --euc --sjis --mime --base64
           Convert for the named code.

       --jis-input --euc-input --sjis-input --mime-input --base64-input
           Assume the named input code.

       --ic=input codeset --oc=output codeset
           Set the input or output codeset.  NKF supports the following
           codesets; codeset names are case insensitive.

           ISO-2022-JP
               a.k.a. RFC1468, 7bit JIS, JUNET

           EUC-JP (eucJP-nkf)
               a.k.a. AT&T JIS, Japanese EUC, UJIS

           eucJP-ascii
           eucJP-ms
           CP51932
               Microsoft Version of EUC-JP.

           Shift_JIS
               a.k.a. SJIS, MS-Kanji

           CP932
               a.k.a. Windows-31J

           UTF-8
               same as UTF-8N

           UTF-8N
               UTF-8 without BOM

           UTF-8-BOM
               UTF-8 with BOM

           UTF-16
               same as UTF-16BE

           UTF-16BE
               UTF-16 Big Endian without BOM

           UTF-16BE-BOM
               UTF-16 Big Endian with BOM

           UTF-16LE
               UTF-16 Little Endian without BOM

           UTF-16LE-BOM
               UTF-16 Little Endian with BOM

           UTF8-MAC (input only)

       --fb-{skip, html, xml, perl, java, subchar}
           Specify how nkf handles unassigned characters.  Without this
           option, --fb-skip is assumed.

       --prefix=<escape character><target character>..
           When nkf converts to Shift_JIS, it adds the specified escape
           character to the specified 2nd bytes of Shift_JIS characters.
           The 1st byte of the argument is the escape character and the
           following bytes are the target characters.

       --no-cp932ext
           Handle the characters extended in CP932 as unassigned characters.

       --no-best-fit-chars
           When converting from Unicode to an encoded byte sequence, do not
           convert characters that are not round-trip safe.  When converting
           from Unicode to Unicode, nkf can be used as a UTF converter if
           this option and -x are both given.  (In other words, without this
           option and -x, nkf does not preserve some characters.)

           Use this option when nkf converts strings that relate to paths.

       --cap-input
           Decode hex encoded characters.

       --url-input
           Unescape percent escaped characters.

       --numchar-input
           Decode character references, such as "&#....;".

       --  Ignore the rest of the -options.
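
       As an illustration of how the --ic and --oc options combine with the
       calling convention of nkf(), the sketch below builds the flag list
       from codeset names.  The helper name codeset_flags and its from/to
       keys are hypothetical and not part of NKF.  No validation is done,
       so whether a name is accepted is left to nkf itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of NKF): build --ic/--oc flags from
# codeset names.  Either key may be omitted; omitting "from" leaves the
# input encoding to nkf's automatic detection.
sub codeset_flags {
    my (%opt) = @_;
    my @flags;
    push @flags, "--ic=$opt{from}" if defined $opt{from};
    push @flags, "--oc=$opt{to}"   if defined $opt{to};
    return @flags;
}

my @flags = codeset_flags(from => "CP932", to => "UTF-8");
print "@flags\n";    # prints "--ic=CP932 --oc=UTF-8"

# With the NKF module installed, the flags go before the text:
#   use NKF;
#   my $utf8 = nkf(@flags, $cp932_bytes);
```

       Keeping the flags in a list makes it easy to append further options,
       for example --no-best-fit-chars, before the text argument.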

AUTHOR

       Copyright (C) 1987, FUJITSU LTD. (I.Ichikawa), 2000 S. Kono, COW
       Copyright (C) 2002-2006 Kono, Furukawa, Naruse, mastodon

SEE ALSO

       perl(1), nkf(1)

perl v5.8.8                       2006-06-19                            NKF(3)