NKF(3) User Contributed Perl Documentation NKF(3)

NAME
NKF - Perl extension for Network Kanji Filter

SYNOPSIS
use NKF;
$output = nkf("-s", $input);

DESCRIPTION
This is the Perl extension version of nkf (Network Kanji Filter). It
converts its last argument and returns the converted result. Conversion
details are specified by the flags that precede the last argument.
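
The calling convention above can be sketched as a round trip between UTF-8 and Shift_JIS. This is a minimal illustration that assumes the NKF module is installed; the sample text and the variable names are chosen here for illustration only.

```perl
use strict;
use warnings;
use NKF;

# "日本語" as raw UTF-8 bytes (no "use utf8", so this is a byte string).
my $utf8 = "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e";

# Flags come first, the text to convert comes last.
# -W: treat the input as UTF-8; -s: produce Shift_JIS.
my $sjis = nkf("-s", "-W", $utf8);

# -S: treat the input as Shift_JIS; -w: produce UTF-8.
my $back = nkf("-w", "-S", $sjis);

printf "Shift_JIS bytes: %s\n",
    join " ", map { sprintf "%02X", ord } split //, $sjis;
print "round trip ok\n" if $back eq $utf8;
```

Both the argument and the return value are byte strings, so the example avoids "use utf8" and spells the sample text as byte escapes.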

nkf is yet another kanji code converter for use among networks, hosts
and terminals. It converts the input kanji code to a designated kanji
code such as ISO-2022-JP, Shift_JIS, EUC-JP, UTF-8 or UTF-16.

One of nkf's most distinctive features is that it guesses the input
kanji encoding. It currently recognizes ISO-2022-JP, Shift_JIS, EUC-JP,
UTF-8 and UTF-16, so users need not set the input kanji code explicitly.
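
As a small sketch of the guessing behaviour, the call below passes Shift_JIS bytes without any input flag and asks only for UTF-8 output. It assumes the NKF module is installed; the sample bytes are illustrative, and short or ambiguous inputs may be guessed differently.

```perl
use strict;
use warnings;
use NKF;

# "日本語" encoded as Shift_JIS, given as raw bytes.
my $sjis = "\x93\xfa\x96\x7b\x8c\xea";

# No -S/-E/-J/-W flag: nkf is left to guess the input encoding.
my $utf8 = nkf("-w", $sjis);

print $utf8 eq "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"
    ? "input was recognized as Shift_JIS\n"
    : "guess differed; pass an explicit input flag\n";
```

When the input encoding is known in advance, naming it explicitly with -S, -E, -J or -W removes the dependence on the guess.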

By default, X0201 kana is converted into X0208 kana. For X0201 kana,
SO/SI, SSO and ESC-(-I methods are supported. For automatic code
detection, nkf assumes no X0201 kana in Shift_JIS. To accept X0201 in
Shift_JIS, use -X, -x or -S.

OPTIONS
-b -u
Output is buffered (DEFAULT) or unbuffered, respectively.

-j -s -e -w -w16
Output code is ISO-2022-JP (7-bit JIS), Shift_JIS, EUC-JP, UTF-8N or
UTF-16BE, respectively. Without one of these options or a compile-time
option, ISO-2022-JP is assumed.

-J -S -E -W -W16
Input is assumed to be 7-bit JIS, Shift_JIS, EUC-JP, UTF-8 or UTF-16LE,
respectively.

-J Assume JIS input. It also accepts EUC-JP. This is the
default. This flag does not exclude Shift_JIS.

-S Assume Shift_JIS and X0201 kana input. It also accepts JIS.
EUC-JP is recognized as X0201 kana. Without the -x flag, X0201 kana
(halfwidth kana) is converted into X0208.

-E Assume EUC-JP input. It also accepts JIS. Same as -J.

-t No conversion.

-i[@B]
Specify the escape sequence for JIS X 0208-1978/83. (DEFAULT B)

-o[BJH]
Specify the escape sequence for ASCII/Roman. (DEFAULT B)

-r Encrypt or decrypt with ROT13/47.
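
For readers unfamiliar with the two rotations named by -r, the following pure-Perl sketch shows what ROT13 and ROT47 compute on ASCII text. It does not call nkf and makes no claim about which characters nkf applies each rotation to; the function names rot13 and rot47 are hypothetical helpers for this illustration.

```perl
use strict;
use warnings;

# ROT13: rotate ASCII letters by 13 places; applying it twice restores the text.
sub rot13 {
    my ($text) = @_;
    (my $out = $text) =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    return $out;
}

# ROT47: rotate the 94 printable ASCII characters '!'..'~' by 47 places.
sub rot47 {
    my ($text) = @_;
    (my $out = $text) =~ tr/!-~/P-~!-O/;
    return $out;
}

print rot13("Hello"), "\n";          # Uryyb
print rot47("Hello"), "\n";          # w6==@
print rot13(rot13("Hello")), "\n";   # Hello
```

Both rotations are their own inverse, which is why a single flag can serve for encrypting and decrypting.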

-h[123] --hiragana --katakana --katakana-hiragana
-h1 --hiragana
Katakana to Hiragana conversion.

-h2 --katakana
Hiragana to Katakana conversion.

-h3 --katakana-hiragana
Katakana to Hiragana and Hiragana to Katakana conversion.
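
A short sketch of the kana options, assuming the NKF module is installed. The input and output are both kept in UTF-8 with -W and -w so that only the kana conversion changes the text; the sample word is illustrative.

```perl
use strict;
use warnings;
use NKF;

# "カタカナ" (katakana) as raw UTF-8 bytes.
my $katakana = "\xe3\x82\xab\xe3\x82\xbf\xe3\x82\xab\xe3\x83\x8a";

# -h1 requests katakana-to-hiragana conversion.
my $hiragana = nkf("-w", "-W", "-h1", $katakana);

# -h2 requests the opposite direction.
my $again = nkf("-w", "-W", "-h2", $hiragana);

print "round trip ok\n" if $again eq $katakana;
```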

-T Text mode output (MS-DOS)

-l ISO8859-1 (Latin-1) support

-f[m [- n]]
Fold lines at length m with a margin of n. Without this option,
fold length is 60 and fold margin is 10.

-F Line folding that preserves existing new lines.

-Z[0-3]
Convert X0208 alphabet (fullwidth alphabets) to ASCII.

-Z -Z0
Convert X0208 alphabet to ASCII.

-Z1 Convert X0208 kankaku (ideographic space) to a single ASCII space.

-Z2 Convert X0208 kankaku to two ASCII spaces.

-Z3 Replace fullwidth >, <, ", & with '&gt;', '&lt;', '&quot;',
'&amp;' as in HTML.
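
The idea behind the -Z family can be illustrated in pure Perl on decoded character strings. This sketch does not call nkf: it assumes the Unicode fullwidth block U+FF01..U+FF5E as a stand-in for the X0208 alphabet, so the exact set of characters nkf converts may differ, and fullwidth_to_ascii is a hypothetical helper name.

```perl
use strict;
use warnings;

# Map fullwidth forms U+FF01..U+FF5E onto ASCII U+0021..U+007E and replace
# the ideographic space U+3000 with the caller's choice of ASCII spacing
# (one space resembles -Z1, two spaces resemble -Z2).
sub fullwidth_to_ascii {
    my ($text, $space) = @_;
    (my $out = $text) =~ tr/\x{FF01}-\x{FF5E}/\x{21}-\x{7E}/;
    $out =~ s/\x{3000}/$space/g;
    return $out;
}

# Fullwidth "NKF", an ideographic space, then a fullwidth "2".
my $wide = "\x{FF2E}\x{FF2B}\x{FF26}\x{3000}\x{FF12}";

print fullwidth_to_ascii($wide, " "), "\n";    # NKF 2
```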

-X -x
Assume X0201 kana in MS-Kanji. With -X or without this option,
X0201 is converted into X0208 kana. With -x, try to preserve X0208
kana and do not convert X0201 kana to X0208. In JIS output,
ESC-(-I is used. In EUC output, SSO is used.

-B[0-2]
Assume broken JIS-Kanji input, which has lost its ESC characters.
Useful when your site is using the old B-News Nihongo patch.

-B1 allows any char after ESC-( or ESC-$.

-B2 forces ASCII after NL.

-I Replace non-ISO-2022-JP characters with a geta character (the
substitute character in Japanese).

-m[BQN0]
MIME ISO-2022-JP/ISO8859-1 decode. (DEFAULT) To see ISO8859-1
(Latin-1), -l is necessary.

-mB Decode MIME base64 encoded stream. Remove the header or other
parts before conversion.

-mQ Decode MIME quoted stream. '_' in a quoted stream is converted to
a space.

-mN Non-strict decoding. It allows a line break in the middle of the
base64 encoding.

-m0 No MIME decode.
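
The default decoding and the -m0 switch can be compared on a single encoded-word. This sketch assumes the NKF module is installed; the header value is an illustrative base64 encoding of "日本語" in ISO-2022-JP.

```perl
use strict;
use warnings;
use NKF;

# An encoded-word holding "日本語" in ISO-2022-JP, base64 encoded.
my $header = '=?ISO-2022-JP?B?GyRCRnxLXDhsGyhC?=';

# MIME decoding is on by default; -w asks for UTF-8 output.
my $decoded = nkf("-w", $header);

# -m0 switches MIME decoding off, so the encoded-word is passed through.
my $raw = nkf("-w", "-m0", $header);

print "decoded\n"   if $decoded eq "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e";
print "untouched\n" if $raw eq $header;
```

Passing -m0 is useful when the input may contain text that merely looks like an encoded-word and must be preserved verbatim.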

-M MIME encode. Header style. All ASCII code and control characters
are left intact.

-MB MIME encode Base64 stream. Kanji conversion is performed
before encoding, so this cannot be used as a picture encoder.

-MQ Perform quoted encoding.

-l Input and output codes are ISO8859-1 (Latin-1) and ISO-2022-JP. -s,
-e and -x are not compatible with this option.

-L[uwm] -d -c
Convert line breaks.

-Lu -d
unix (LF)

-Lw -c
windows (CRLF)

-Lm mac (CR)

Without this option, nkf doesn't convert line breaks.
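
A minimal sketch of line break conversion, assuming the NKF module is installed. The text is plain ASCII so that only the line breaks are affected; the variable names are illustrative.

```perl
use strict;
use warnings;
use NKF;

my $unix_text = "one\ntwo\n";

# -Lw: convert line breaks to CRLF.
my $windows_text = nkf("-Lw", $unix_text);

# -Lu: convert line breaks back to LF.
my $restored = nkf("-Lu", $windows_text);

print "round trip ok\n" if $restored eq $unix_text;
```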

--fj --unix --mac --msdos --windows
Convert for these systems.

--jis --euc --sjis --mime --base64
Convert to the named code.

--jis-input --euc-input --sjis-input --mime-input --base64-input
Assume the named input code.

--ic=<input codeset> --oc=<output codeset>
Set the input or output codeset. NKF supports the following codesets;
codeset names are case insensitive.

ISO-2022-JP
a.k.a. RFC1468, 7bit JIS, JUNET

EUC-JP (eucJP-nkf)
a.k.a. AT&T JIS, Japanese EUC, UJIS

eucJP-ascii
eucJP-ms
CP51932
Microsoft Version of EUC-JP.

Shift_JIS
a.k.a. SJIS, MS-Kanji

CP932
a.k.a. Windows-31J

UTF-8
same as UTF-8N

UTF-8N
UTF-8 without BOM

UTF-8-BOM
UTF-8 with BOM

UTF-16
same as UTF-16BE

UTF-16BE
UTF-16 Big Endian without BOM

UTF-16BE-BOM
UTF-16 Big Endian with BOM

UTF-16LE
UTF-16 Little Endian without BOM

UTF-16LE-BOM
UTF-16 Little Endian with BOM

UTF8-MAC (input only)
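
The codeset options can be sketched as follows, assuming the NKF module is installed. The sample text and variable names are illustrative; the codeset names are taken from the list above.

```perl
use strict;
use warnings;
use NKF;

my $utf8 = "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e";   # "日本語" in UTF-8

# Long-option counterpart of "-W -s".
my $sjis = nkf("--ic=UTF-8", "--oc=Shift_JIS", $utf8);

# Codeset names are case insensitive, so this should give the same bytes.
my $same = nkf("--ic=utf-8", "--oc=shift_jis", $utf8);

# EUC-JP output from the same input.
my $euc = nkf("--ic=UTF-8", "--oc=EUC-JP", $utf8);

print "case-insensitive names agree\n" if $sjis eq $same;
printf "EUC-JP bytes: %s\n",
    join " ", map { sprintf "%02X", ord } split //, $euc;
```

The long form is mainly useful for the codesets that have no single-letter flag, such as CP932 or UTF-8-BOM.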

--fb-{skip, html, xml, perl, java, subchar}
Specify the way that nkf handles unassigned characters. Without
this option, --fb-skip is assumed.

--prefix=<escape character><target character>..
When nkf converts to Shift_JIS, nkf adds the specified escape character
before the specified 2nd bytes of Shift_JIS characters. The 1st byte of
the argument is the escape character and the following bytes are the
target characters.

--no-cp932ext
Handle the characters extended in CP932 as unassigned characters.

--no-best-fit-chars
In Unicode to encoded byte conversion, don't convert characters
that are not round-trip safe. In Unicode to Unicode conversion,
with this and the -x option, nkf can be used as a UTF converter. (In
other words, without this and the -x option, nkf doesn't preserve some
characters.)

When nkf converts strings related to paths, you should use this
option.

--cap-input
Decode hex encoded characters.

--url-input
Unescape percent escaped characters.
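
A brief sketch of --url-input, assuming the NKF module is installed. The escaped string is an illustrative percent-encoding of UTF-8 text, so -W is given to name the encoding of the unescaped bytes.

```perl
use strict;
use warnings;
use NKF;

# "日本" as percent-escaped UTF-8, as it might appear in a URL.
my $escaped = '%E6%97%A5%E6%9C%AC';

# Unescape first, then treat the resulting bytes as UTF-8 and emit UTF-8.
my $utf8 = nkf("-w", "-W", "--url-input", $escaped);

print "unescaped\n" if $utf8 eq "\xe6\x97\xa5\xe6\x9c\xac";
```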

--numchar-input
Decode character references, such as "&#....;".

-- Ignore the rest of the -options.

AUTHOR
Copyright (C) 1987, FUJITSU LTD. (I.Ichikawa), 2000 S. Kono, COW
Copyright (C) 2002-2006 Kono, Furukawa, Naruse, mastodon

SEE ALSO
perl(1), nkf(1)

perl v5.8.8 2006-06-19 NKF(3)