skf(1) - f31

SKF(1)                      General Commands Manual                     SKF(1)

NAME

       skf - simple Kanji Filter (v2.1)

SYNOPSIS

       skf [-EIJKNQRSXZbehjknqrsuvxz] [ long_format_options ] [infiles..]

DESCRIPTION

       skf is yet another i18n-capable kanji filter, designed for reading
       various CJK-coded files on the Net.  skf converts input kanji texts or
       streams into a character stream in the designated codeset and writes
       it to standard output.  Specifically, skf is designed to be a
       versatile filter for reading documents in various code sets, and does
       not provide features unrelated to code conversion.

       Like nkf, skf automatically recognizes the input file code when it is a
       kind of ISO-2022 compliant code, and also detects EUC-variant codes if
       the input file is Japanese text without X 0201 kana.  skf 2.1 can read
       various ISO-2022 compliant character sets, including JIS Kanji codes (X
       0208, X 0212 and X 0213), EUC encodings (euc-jp (with X 0213 support),
       euc-cn, euc-kr and euc-tw), ISO European Latin sets (ISO-8859-1 to 11,
       13/14/15/16) and many regional character sets.  skf can also read some
       non-ISO-2022 compliant sets, including Microsoft Shift-JIS code,
       KOI-8-R/U, GB2312 (HZ), big5, VISCII (rfc1456, including VIQR), Unicode
       standard (UCS2/UTF-16, UTF7 and UTF8), some MS codesets (cp1250 etc.)
       and some other vendor-specific codes (KEIS83, JEF etc.).

       Supported output character sets of skf are more limited, but still
       include X 0208/X 0212/X 0213 JIS, X 0201 JIS, ASCII, Microsoft
       Shift-JIS, EUC-jp/-kr/-cn, HZ, iso-2022-jp/kr, big5, VISCII and Unicode.

       skf also provides basic decoding features for some common encodings,
       including MIME, Punycode and URI codepoints.  A Unicode decomposition
       feature has also been supported since 1.96.

       As noted above, skf is designed to convert input text into some kind of
       human-readable form under a local environment (i.e. codeset), and has
       several extra conversion features such as GNU recode style folding.
       Such conversions include Windows/Macintosh specific code swaps,
       old-new JIS glyph changes, html-format/TeX format conversion and
       variant unification.

       skf can also be compiled as an extension for some lightweight
       languages.  See README.txt for details.

       If one or more file names are given, skf reads the files and writes
       the converted stream to stdout.  If no file names are given, input is
       taken from stdin and output still goes to stdout.  Options are taken
       from the environment variables SKFENV and skfenv and from the command
       line, in this order.  Environment variables are not used when skf is
       running as a privileged user.  skf does not use LOCALE-related
       environment variables for conversions, but error message output is
       controlled by the given locale.
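
       The option order described above can be sketched as follows.  This is
       only an illustration in Python, not skf's own code: the function name
       collect_options, the whitespace splitting of the environment values
       and the privileged flag are all assumptions made for the example.

```python
import shlex


def collect_options(environ, argv, privileged=False):
    """Gather options in the documented order: SKFENV, skfenv, command line.

    Assumption: environment values hold whitespace-separated options.
    Assumption: the caller decides what counts as a privileged user.
    """
    options = []
    if not privileged:
        # Environment variables are skipped entirely for privileged users.
        for name in ("SKFENV", "skfenv"):
            options.extend(shlex.split(environ.get(name, "")))
    options.extend(argv)
    return options


env = {"SKFENV": "--oc=utf8", "skfenv": "-u"}
print(collect_options(env, ["--ic=sjis", "input.txt"]))
print(collect_options(env, ["--ic=sjis", "input.txt"], privileged=True))
```

       The sketch only shows the order in which the three sources are read;
       how skf resolves conflicting options is not modelled here.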

CODESET OPTIONS

       skf is written from scratch and inherits no code from nkf.  However,
       skf is intended to be a drop-in replacement for nkf (v1.4) and supports
       a similar set of the commonly used nkf options.
       skf 2.1 recognizes the following options.  Defaults are all off if not
       explicitly specified.

   buffering control
       -b     use buffered output.  This is the default.

       -u     use unbuffered output.  The code detection feature is disabled
              when this option is on.

   Input/Output codeset options
       --ic=  input_code_set
              specify that the input codeset is input_code_set.  Possible
              candidates are shown below.

       --oc=  output_code_set
              specify that the output codeset is output_code_set.  Possible
              candidates are shown below.  The default codeset in the
              distribution package is euc-jp, but it depends on compile
              options.  Default codeset is shown by

     Supported codeset
       skf recognizes the following codesets as input/output codesets.  These
       codeset names are case insensitive, and minus ('-') and underscore
       ('_') are ignored.  Note that an iso-2022 escape-based input codeset
       (registered with IANA) is recognized automatically, even when a
       non-iso2022 codeset (except Unicode and B-Right/V) is specified.  In
       the table, o in the in column means the named codeset can be specified
       as input, and x means it cannot.  The out column is the same, but for
       output.
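
       The name matching rule above (case insensitive, '-' and '_' ignored)
       can be pictured with a small Python sketch.  The helper names and the
       tiny KNOWN set are hypothetical and only illustrate the rule; they are
       not taken from skf's source.

```python
def normalize_codeset_name(name: str) -> str:
    """Fold case and drop '-' and '_' so equivalent spellings compare equal."""
    return name.lower().replace("-", "").replace("_", "")


# Hypothetical lookup set built from a few names in the codeset table.
KNOWN = {
    normalize_codeset_name(n)
    for n in ("euc-jp", "iso8859-1", "shiftjis-2004", "utf8")
}


def is_known(name: str) -> bool:
    """Return True if the name matches a known codeset after normalization."""
    return normalize_codeset_name(name) in KNOWN


for spelling in ("EUC_JP", "euc-jp", "Shift_JIS-2004", "UTF-8"):
    print(spelling, "->", normalize_codeset_name(spelling), is_known(spelling))
```

       Under this rule, --oc=EUC_JP and --oc=euc-jp name the same codeset.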

       in out  name               description
       o  o    iso8859-1          ascii + iso-8859-1 (latin-1)
       o  o    iso8859-2          ascii + iso-8859-2 (latin-2)
       o  o    iso8859-3          ascii + iso-8859-3 (latin-3)
       o  o    iso8859-4          ascii + iso-8859-4 (latin-4)
       o  o    iso8859-5          ascii + iso-8859-5 (Cyrillic)
       o  o    iso8859-6          ascii + iso-8859-6 (Arabic)
       o  o    iso8859-7          ascii + iso-8859-7 (Greek)
       o  o    iso8859-8          ascii + iso-8859-8 (Hebrew)
       o  o    iso8859-9          ascii + iso-8859-9 (latin-5)
       o  o    iso8859-10         ascii + iso-8859-10 (latin-6)
       o  o    iso8859-11         ascii + iso-8859-11 (Thai)
       o  o    iso8859-13         ascii + iso-8859-13 (Baltic Rim)
       o  o    iso8859-14         ascii + iso-8859-14 (Celtic)
       o  o    iso8859-15         ascii + iso-8859-15 (Latin-9)
       o  o    iso8859-16         ascii + iso-8859-16
       o  o    koi-8r             koi-8r (Russian)
       o  o    cp1251             Cyrillic latin MS cp1251
       o  o    jis                iso-2022-jp (rfc1468 7bit JIS)
       o  o    iso-2022-jp-x0213  iso-2022-jp-3 (JIS X 0213:2000)
                                  a.k.a. jis-x0213
       o  o    jis-x0213-strict   iso-2022-jp-3-strict
       o  o    iso-2022-jp-2004   iso-2022-jp-2004 (JIS X 0213:2004)
                                  a.k.a. jis-x0213-2004
       o  o    oldjis             iso-2022-jp-1978 (JIS X 0208:1978)
       o  o    cp50220            Microsoft codepage 50220
       o  o    cp50221            Microsoft codepage 50221
       o  o    cp50222            Microsoft codepage 50222
       o  o    euc-jp             EUC-encoded JIS X 0208:1997
       o  o    euc-x0213          EUC-encoded JIS X 0213:2000
       o  o    euc-jis-2004       EUC-encoded JIS X 0213:2004
       o  o    cp51932            EUC-encoded Microsoft codepage 932
       o  o    euc-kr             EUC-encoded KS X 1001 Korean
       o  o    euc7-kr            7bit EUC-encoded KS X 1001 Korean
       o  o    uhc                Unified Hangul Code (Windows cp949)
       o  o    johab              KS X 1001-johab Korean
       o  o    euc-cn             EUC-encoded GB2312 Chinese
       o  o    euc7-cn            7bit EUC-encoded GB2312 Chinese
       o  o    hz                 HZ-encoded GB2312 Chinese
       o  o    euc-tw             EUC-encoded CNS 11643 Chinese
       o  o    gb12345            EUC-encoded GB12345 Chinese
       o  o    gbk                GB2312 Extension (cp936) Chinese
       o  o    gb18030            GB18030 Chinese
       o  o    big5               BIG5 (with Eten extension + EURO)
       o  o    cp950              BIG5 (Microsoft cp950 + EURO)
       o  o    big5-hkscs         BIG5 with HKSCS
       o  o    big5-2003          BIG5-2003
       o  o    big5-uao           BIG5-Unicode at On
       o  o    sjis               Shift-jis (Microsoft cp943)
       o  o    shiftjis-x0213     Shiftjis-encoded JIS X 0213:2000
       o  o    shiftjis-2004      Shiftjis-encoded JIS X 0213:2004
       o  o    sjis-docomo        Shiftjis-encoded with NTT Docomo emoticons.
       o  o    sjis-au            Shiftjis-encoded with AU emoticons.
       o  o    sjis-softbank      Shiftjis-encoded with SoftBank emoticons.
       o  o    oldsjis            Shift-jis (JIS X 0208:1978)
       o  o    cp932              Shift-jis-encoded MS cp932
       o  o    cp932w             Shift-jis-encoded MS cp932 with
                                  MS compatibility
       o  o    viscii             VISCII (rfc1456) Vietnamese
       o  o    viqr               VISCII (rfc1456-VIQR) Vietnamese
       o  o    keis               Hitachi KEIS83/90
       o  x    jef                Fujitsu JEF (basic support only)
       o  x    ibm930             IBM EBCDIC DBCS Japanese
       o  x    ibm931             IBM EBCDIC DBCS Japanese w.latin
       o  x    ibm933             IBM EBCDIC DBCS Korean
       o  x    ibm935             IBM EBCDIC DBCS Simpl. Chinese
       o  x    ibm937             IBM EBCDIC DBCS Trad. Chinese
       o  o    unicode            Unicode(TM) UTF-16LE
       o  o    unicodefffe        Unicode(TM) UTF-16BE
       o  o    utf7               Unicode(TM) UTF-7
       o  o    utf8               Unicode(TM) UTF-8
       x  o    nyukan-utf-8       Nyukan-moji (Japanese nyukoku-kanrikyoku
                                  gaiji), encoded in utf-8
       x  o    nyukan-utf-16      Nyukan-moji (Japanese nyukoku-kanrikyoku
                                  gaiji), encoded in utf-16
       o  x    arib-b24           ARIB B24 8-bit JIS-based
       o  x    arib-b24-sj        ARIB B24 8-bit SJIS-based
       x  o    transparent        Transparent mode (see below)

     Codeset explanations
       iso-8859-*
              When specified as output, G0 = GL is ascii and G1 = GR is
              iso-8859-*.  8-bit encoding is used.

       iso-2022-jp, jis
              Encoding is iso-2022-jp-2 (RFC1554).  G0 = GL is JIS X 0201
              roman, G1 = GR is JIS X 0201 kana, G2 is iso-8859-1 and G3 is
              JIS X 0212:1990 Supplementary Kanji.

       jis-x0213, iso-2022-jp-3
              Encoding is iso-2022-jp-3 (JIS X 0213:2000 based).  For output,
              G0 = GL is JIS X 0201 roman, G1 = GR is JIS X 0201 kana, G2 is
              iso-8859-1 and G3 is JIS X 0213 plane 2 Kanji.

       jis-x0213-strict
              Encoding is a subset of iso-2022-jp-3-strict (uses Plane 1
              only).  For output, G0 = GL is JIS X 0201 roman, G1 = GR is JIS
              X 0201 kana, G2 is iso-8859-1 and G3 is not set.  Output uses
              JIS X 0208 codes whenever possible.  JIS X 0213 input is
              automatically recognized.

       jis-x0213-2004, iso-2022-jp-2004
              Encoding is iso-2022-jp-2003:2004.  For output, G0 = GL is JIS X
              0201 roman, G1 = GR is JIS X 0201 kana, G2 is iso-8859-1 and G3
              is JIS X 0213 plane 2 Kanji.

       oldjis
              Encoding is iso-2022-jp using the old JIS X 0208:1978.  G0 = GL
              is JIS X 0201 roman, G1 = GR is JIS X 0201 kana, G2 is
              iso-8859-1 and G3 is JIS X 0212 Supplementary Kanji.

       euc-jp, euc
              Encoding is 8-bit EUC using the JIS X 0208:1997 character set.
              G0 = GL is ascii, G1 = GR is JIS X 0208, G2 is JIS X 0201 kana
              and G3 is JIS X 0212 Supplementary Kanji.

       euc-x0213, euc-jis-2003
              Encoding is 8-bit EUC-based JIS X 0213:2000.  G0 = GL is ascii,
              G1 = GR is X 0213:2000 plane 1, G2 is iso-8859-1 and G3 is JIS X
              0213:2000 plane 2 Kanji.

       euc-jis-2004
              Encoding is 8-bit EUC-based JIS X 0213:2004.  G0 = GL is ascii,
              G1 = GR is X 0213:2004 plane 1, G2 is iso-8859-1 and G3 is JIS
              X 0213:2004 plane 2 Kanji.

       euc-kr
              Encoding is 8-bit EUC using the KS X 1001 Wansung character
              set.  G0 = GL is KS X1003, G1 = GR is KS X1001, G2 and G3 are
              not set.

       euc7-kr iso-2022-kr
              Encoding is iso-2022-kr (rfc1557): 7-bit EUC using the KS X 1001
              Wansung character set.  G0 = GL is KS X1003, G1 is KS X1001, G2
              and G3 are not set.

       euc-cn
              Encoding is 8-bit EUC using the GB 2312 simplified Chinese
              character set.  G0 = GL is ASCII, G1 = GR is GB2312, G2 and G3
              are not set.

       euc7-cn
              Encoding is 7-bit EUC using the GB 2312 simplified Chinese
              character set.  G0 = GL is ASCII, G1 is GB2312, G2 and G3 are
              not set.

       hz
              Encoding is the HZ encoded (rfc1842) GB 2312 simplified Chinese
              character set.  G0 = GL is ASCII, G1 = GR is GB2312, G2 and G3
              are not set.

       euc-tw
              Encoding is the EUC encoded CNS11643 Plane 1/2 traditional
              Chinese character set, a subset of iso-2022-cn.  G0 = GL is
              ASCII, G1 = GR is CNS11643 plane 1, G2 is CNS11643 plane 2 and
              G3 is not set.

       gb12345
              Encoding is 8-bit EUC using the GB 12345 (GBF) traditional
              Chinese character set.  G0 = GL is ASCII, G1 = GR is GB12345, G2
              and G3 are not set.

       gbk, cp936
              Encoding is the GBK simplified Chinese character set.  G0 = GL
              is ASCII and G1 = GR is GBK.  G2 and G3 are not set.

       gb18030 (experimental)
              Encoding is the GB18030 (ibm-1392, Windows cp54936) Chinese
              character set.  Uses ASCII as latin part.

       big5
              Encoding is the Big5 traditional Chinese character set with the
              ETen extension.  Includes Euro mapping.  Uses ASCII as latin
              part.

       cp950
              Encoding is the Microsoft cp950-Big5 traditional Chinese
              character set.  Uses ASCII as latin part.

       big5-hkscs (experimental)
              Encoding is the cp950-Big5 traditional Chinese character set
              with the HKSCS extension.  Uses ASCII as latin part.

       big5-2003 (experimental)
              Encoding is the Big5-2003 Taiwanese standard traditional Chinese
              character set.  Uses ASCII as latin part.

       big5-uao (experimental)
              Encoding is the Big5-UAO (http://uao.cpatch.org) traditional
              Chinese character set.  Uses ASCII as latin part.

       VISCII (experimental)
              Vietnamese VISCII (rfc1456) character set.  Not TCVN-5712.

       VIQR (experimental)
              Vietnamese VISCII character set with VIQR encoding (rfc1456).

       sjis
              Encoding is the Shift-encoded JIS X 0208:1997 character set.
              Note that this is not cp932.  Uses JIS X 0201 latin as the
              latin (GL) part.

       sjis-x0213, shift_jis-2000
              Encoding is Shift-encoded JIS using the JIS X 0213:2000
              character set.

       sjis-x0213-2004, shift_jis-2004
              Encoding is Shift-encoded JIS using the JIS X 0213:2004
              character set.  Ten newly defined characters are added, but the
              Unicode mapping is the same as for JIS X 0213:2000.  Uses JIS X
              0201 latin as the latin (GL) part.

       sjis-cellular (experimental)
              Encoding is the Shift-encoded JIS X 0208:1997 character set with
              NTT Docomo/Vodafone (SoftBank) cellular phone glyph mapping.
              Output is not supported.

       cp932 cp932w
              Encoding is Microsoft SJIS cp932 with the NEC/IBM gaiji area,
              based on the Windows XP mapping.  Uses ASCII as the latin (GL)
              part.  --use-compat and --use-ms-compat are automatically
              enabled.  cp932w provides further WideCharToMultiByte
              compatibility.

       cp51932
              Encoding is Microsoft EUC-based cp51932 with the NEC/IBM gaiji
              area, based on the Windows XP mapping.  Uses ASCII as G0 and JIS
              X 0201 kana as the EUC G2 part.  G3 is not used for output, and
              is treated as JIS X 0212:2000 for input.  --use-compat and
              --use-ms-compat are automatically enabled.

       cp50220, cp50221, cp50222
              Encoding is Microsoft JIS-based cp50220, cp50221 or cp50222 with
              the NEC/IBM gaiji area, based on the Windows XP mapping.  For
              input, skf accepts cp50220, 50221 and 50222.  Note that this
              codeset is NOT compatible with iso-2022.  Uses ASCII as the
              default character set.  --use-compat and --use-ms-compat are
              automatically enabled.

       oldsjis
              Encoding is Microsoft SJIS (JIS X 0208:1978 a.k.a. old JIS).
              Uses JIS X 0201 latin as the latin (GL) part.

       johab
              Encoding is the KS X1001 (Johab) character set.  Uses KS X1003
              latin as the latin (GL) part.

       uhc
              Encoding is the UHC (cp949) character set.  Uses ASCII as the
              latin (GL) part.

       unicode, unicodefffe, utf16, utf16le
              Encoding is Unicode UTF-16 (v11.0).  The default input/output
              byte order is little endian for unicode and big endian for
              unicodefffe, and an input byte order mark is recognized.  utf16
              and unicodefffe are big endian.  utf16le and unicode are little
              endian.  Output includes an endian mark by default unless
              --disable-endian-mark is specified.  Output range is within
              UTF-32, using surrogate pairs, unless --limit-to-ucs2 is
              specified.
              Note that ucs2 is not supported within the lightweight language
              extension for either input or output, because of a limitation in
              the data structure SWIG passes.  Specifying ucs2 generates an
              error.

       utf8
              Encoding is UTF-8 encoded Unicode (v11.0).  Output doesn't
              include a byte order mark unless --enable-endian-mark is
              specified.  Output range is within UTF-32 unless --limit-to-ucs2
              is specified.  By default, CESU-8 is not accepted as input.  The
              option --enable-cesu8 enables CESU-8 input for the utf-8
              converter.  CESU-8 output is not supported.  For UTF-8, the
              endian mark (BOM) is always ignored.

       utf7
              Encoding is UTF-7 encoded Unicode (v11.0).  Input/output range
              is limited to UTF-16, and values above U+10000 are regarded as
              undefined.  A BOM is always ignored for input, and never used
              for output.

       keis (experimental)
              Encoding is Hitachi KEIS83/90.  Output range is limited to the
              EBCDIK and JIS X 0208 area.

       jef (experimental)
              Encoding is Fujitsu JEF.  Input only.  Only the basic part is
              supported.

       ibm930 (experimental)
              Encoding is IBM DBCS Japanese with EBCDIC Kana.

       ibm931 (experimental)
              Encoding is IBM DBCS Japanese with EBCDIC latin (ibm037).

       ibm933 (experimental)
              Encoding is IBM DBCS Korean with the EBCDIC Wansung character
              set.

       ibm935 (experimental)
              Encoding is IBM DBCS Simplified Chinese with EBCDIC Chinese.

       ibm937 (experimental)
              Encoding is IBM DBCS Traditional Chinese with EBCDIC Chinese.

       koi8r
              Russian KOI-8R code.

       cp1250
              Central European latin Microsoft cp1250 code.

       cp1251
              Eastern European cyrillic Microsoft cp1251 code.

       arib-b24 arib-b24-sj
              ARIB B24 code defined in ARIB STD-B24 vol.1 part.2 chapt. 7.3.
              b24 is 8-bit jis based, and b24-sj is sjis based.

       nyukan-utf-8 nyukan-utf-16
              Normalized Unicode UTF-8/UTF-16 based on Japanese Ministry of
              Justice kokuji No. 582.

       transparent
              Transparent mode.  Various code control features, including
              folding and line end code conversion, are also ignored.


     Shortcuts
       -j     same as --oc=jis

       -s     same as --oc=sjis

       -e     same as --oc=euc-jp

       -q     same as --oc=unicode

       -z     same as --oc=sjis

       -E     same as --ic=euc-jp.  Assume input codeset is EUC-JP.

       -J     same as --ic=jis.  Assume input codeset is iso-2022-jp.

       -S     same as --ic=sjis.  Assume input codeset is shift JIS.

       -Q     same as --ic=utf-16 --input-little-endian.

       -Z     same as --ic=utf8.


     ISO-2022 Specific controls
       These options replace G0 to G3, after they have been set up according
       to the specified input codeset, with the given character set.  Note
       that this doesn't change any codeset properties of the original
       codeset, like language and encoding.

       --set-g0=`charset name'
              Predefines specified code set to plane 0 (G0).  Also set to GL
              at initial state.

       --set-g1=`charset name'
              Predefines specified code set to right plane (G1).  Also set to
              GR at initial state.

       --set-g2=`charset name'
              Predefines specified code set to right plane (G2).

       --set-g3=`charset name'
              Predefines specified code set to right plane (G3).


       The supported charset names are as follows.  'o' means the codeset can
       be set to that plane; 'x' means it cannot.  For Unicode family
       codesets, this option is ignored.  For other non-iso2022 categories,
       this option is not supported, and the result is unpredictable.


       g0 g1 g2 g3    codeset name   description
       o  o  o  o     ascii          ANSI X3.4 ASCII
       o  o  o  o     x0201          JIS X 0201 (latin part)
       x  o  o  o     iso8859-1      ISO 8859-1 latin
       x  o  o  o     iso8859-2      ISO 8859-2 latin
       x  o  o  o     iso8859-3      ISO 8859-3 latin
       x  o  o  o     iso8859-4      ISO 8859-4 latin
       x  o  o  o     iso8859-5      ISO 8859-5 Cyrillic
       x  o  o  o     iso8859-6      ISO 8859-6 Arabic
       x  o  o  o     iso8859-7      ISO 8859-7 Greek-latin
       x  o  o  o     iso8859-8      ISO 8859-8 Hebrew
       x  o  o  o     iso8859-9      ISO 8859-9 latin
       x  o  o  o     iso8859-10     ISO 8859-10 latin
       x  o  o  o     iso8859-11     ISO 8859-11 Thai
       x  o  o  o     iso8859-13     ISO 8859-13 latin
       x  o  o  o     iso8859-14     ISO 8859-14 latin
       x  o  o  o     iso8859-15     ISO 8859-15 latin
       x  o  o  o     iso8859-16     ISO 8859-16 latin
       x  o  o  o     tcvn5712       TCVN 5712 (Vietnamese)
       x  o  o  o     ecma94         ECMA 94 Cyrillic (KOI-8e)
       o  o  o  o     x0212          JIS X 0212:1990
       o  o  o  o     x0208          JIS X 0208:1997
       o  o  o  o     x0213          JIS X 0213 Plane 1:2000
       o  o  o  o     x0213-2        JIS X 0213 Plane 2:2000
       o  o  o  o     x0213n         JIS X 0213 Plane 1:2004
       o  o  o  o     gb2312         Simplified Chinese GB2312
       o  o  o  o     gb1988         Chinese GB1988 (latin)
       o  o  o  o     gb12345        Traditional Chinese GB12345
       o  o  o  o     ksx1003        Korean KS X 1003 (latin)
       o  o  o  o     ksx1001        Korean KS X 1001
       x  o  o  o     koi8-r         Cyrillic KOI-8R
       x  o  o  o     koi8-u         Ukrainian Cyrillic KOI-8U
       o  o  o  o     cns11643-1     Traditional Chinese CNS11643-1
       x  o  o  o     viscii-r       RFC1456 VISCII (right plane)
       o  o  o  o     viscii-l       RFC1456 VISCII (left plane)
       x  o  o  o     cp437          Microsoft cp437 (US latin)
       x  o  o  o     cp737          Microsoft cp737
       x  o  o  o     cp775          Microsoft cp775
       x  o  o  o     cp850          Microsoft cp850
       x  o  o  o     cp852          Microsoft cp852
       x  o  o  o     cp855          Microsoft cp855
       x  o  o  o     cp857          Microsoft cp857
       x  o  o  o     cp860          Microsoft cp860
       x  o  o  o     cp861          Microsoft cp861
       x  o  o  o     cp862          Microsoft cp862
       x  o  o  o     cp863          Microsoft cp863
       x  o  o  o     cp864          Microsoft cp864
       x  o  o  o     cp865          Microsoft cp865
       x  o  o  o     cp866          Microsoft cp866
       x  o  o  o     cp869          Microsoft cp869
       x  o  o  o     cp874          Microsoft cp874
       x  o  o  o     cp932          Microsoft cp932 (Japanese)
       x  o  o  o     cp1250         Microsoft cp1250 (Central Europe)
       x  o  o  o     cp1251         Microsoft cp1251 (Cyrillic)
       x  o  o  o     cp1252         Microsoft cp1252 (Latin-1)
       x  o  o  o     cp1253         Microsoft cp1253 (Greek)
       x  o  o  o     cp1254         Microsoft cp1254 (Turkish)
       x  o  o  o     cp1255         Microsoft cp1255
       x  o  o  o     cp1256         Microsoft cp1256
       x  o  o  o     cp1257         Microsoft cp1257
       x  o  o  o     cp1258         Microsoft cp1258

       --euc-protect-g1
              In EUC input mode, suppress sequences that set a charset to G1.
              Such sequences are discarded.

       --add-annon
              Add an announcer for JIS X 0208:1997 to the X 0208 designate
              sequence.  This option works only with iso-2022-based output.

       --input-detect-jis78
              Distinguish the JIS X 0208:1978 codeset from the JIS X 0208:1997
              codeset.  By default, these two charsets are both regarded as X
              0208:1997.  This option is valid only when the input encoding is
              JIS (iso-2022-jp).
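
       For readers unfamiliar with iso-2022 designation, the following Python
       sketch lists the G0 designating escape sequences found in a byte
       stream.  It illustrates the escape sequences only and says nothing
       about how skf detects them; the function name g0_designations and the
       small DESIGNATIONS table are assumptions made for the example.

```python
# Three-byte escape sequences that designate a character set to G0.
DESIGNATIONS = {
    b"\x1b$@": "JIS X 0208:1978",
    b"\x1b$B": "JIS X 0208:1983 and later revisions",
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201 roman",
}


def g0_designations(data: bytes):
    """Return the known G0 designations in the order they appear in data."""
    found = []
    i = data.find(b"\x1b")
    while i >= 0:
        seq = data[i:i + 3]
        if seq in DESIGNATIONS:
            found.append(DESIGNATIONS[seq])
            i += 3
        else:
            i += 1  # unknown or longer sequence: skip the ESC byte only
        i = data.find(b"\x1b", i)
    return found


new_jis = "日本語".encode("iso2022_jp")    # Python designates with ESC $ B
old_jis = b"\x1b$@F|K\\\x1b(B"             # same code points, ESC $ @ designation

print(g0_designations(new_jis))
print(g0_designations(old_jis))
```

       The two streams differ only in the designating sequence, which is the
       distinction --input-detect-jis78 asks skf to honour.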


     JIS X 0212 (Supplementary Kanji code) Support
       --x0212-enable
              skf by default does not output JIS X 0212 code in JIS/EUC mode.
              This option enables use of the JIS X 0212 part.  Non-Japanese
              codes, Shift_JIS variants, Unicode and KEIS output ignore this
              option.  Note that this option is supported for backward
              compatibility.  It may not be supported in future versions.


     Unicode coding specific control options
       skf 2.10 conforms to the Unicode 11.0 specification.

       --use-compat --suppress-compat
              With --suppress-compat, skf substitutes characters in the
              Unicode compatibility area (U+F900 - U+FFFD) with appropriate
              characters outside that area.  When this substitution is
              enabled, these characters are converted to variants or treated
              as undefined.  With --use-compat, skf outputs characters in
              this area as they are.  The default is --use-compat.  Several
              codesets control this as a codeset feature (i.e. they use the
              compatibility area).  See the codeset section.

       --use-ms-compat
              When output is Unicode, make the Unicode mapping Microsoft
              Windows compatible.  This only changes the conversion of some
              symbols in JIS Kanji, and adding the --use-compat option is
              recommended for round-trip conversion.  If you need stricter
              compatibility, try cp932w as the input codeset.

       --use-cde-compat
              When output is Unicode, make the translation compatible with
              the CDE standard codeset.

       --little-endian
              When output is UTF-16le/be, use little endian byte order.

       --big-endian
              When output is UTF-16le/be, use big endian byte order.

       --disable-endian-mark --enable-endian-mark
              When output is UTF-16 or UTF-8, omit or emit the byte order
              mark, respectively.  To make UTF-16N, use --disable-endian-mark
              with --little-endian.  By default, the BOM is enabled for UTF-16
              and disabled for UTF-8.

       --input-little-endian
              When input is UTF-16le/be, assume the input is in little endian
              byte order.

       --input-big-endian
              When input is UTF-16le/be, assume the input is in big endian
              byte order.

       --endian-protect
              Do not use the endian mark in the input stream.  The endian mark
              is just discarded.  This is off by default.

       --limit-to-ucs2
              Do not use codes beyond the BMP (U+10000 and above) in Unicode
              output.  This option doesn't limit the internal code range in
              skf.  This is off by default.

       --disable-cjk-extension
              Treat the CJK extension A/B areas as undefined.  This is off
              (i.e. these areas are enabled) by default.

       --enable-cesu8
              Enable CESU-8 input in the utf-8 codeset.  Ignored for any other
              codeset.

       --non-strict-utf8
              Enable broken (decodable but not conforming to the
              specification) utf-8 input.  If you need this option, proceed
              with extra care.

       --enable-nfd-decomposition --disable-nfd-decomposition
              Enable/disable Unicode normalized decomposition.  Default is
              disabled.

       --enable-nfda-decomposition --disable-nfda-decomposition
              Enable/disable Apple-compatible Unicode normalized
              decomposition.  Default is disabled.

       --oldcell-to-emoticon
              Convert the old cell-phone gaiji area to emoticons.  Supported:
              NTT Docomo/AU emoticons.  A reverse mapping is not supported.
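
       The Unicode notions behind several of these switches can be tried out
       with Python's unicodedata module.  This is a sketch of the Unicode
       operations only: NFKC is used here as a stand-in for compatibility
       substitution, and skf's own tables (and the Apple-compatible variant)
       may give different results.  The helper names are hypothetical.

```python
import unicodedata


def nfd(text: str) -> str:
    """Canonical decomposition (NFD), the operation named by the nfd options."""
    return unicodedata.normalize("NFD", text)


def in_compat_area(ch: str) -> bool:
    """True for the range the --use-compat entry gives (U+F900 - U+FFFD)."""
    return 0xF900 <= ord(ch) <= 0xFFFD


def outside_bmp(ch: str) -> bool:
    """True for code points that need a surrogate pair in UTF-16."""
    return ord(ch) > 0xFFFF


ga = "\u304c"                  # HIRAGANA LETTER GA
halfwidth_ka = "\uff76"        # HALFWIDTH KATAKANA LETTER KA

print([hex(ord(c)) for c in nfd(ga)])
print(in_compat_area(halfwidth_ka),
      hex(ord(unicodedata.normalize("NFKC", halfwidth_ka))))
print(outside_bmp("\U00020000"), outside_bmp(ga))
```

       Decomposed text of this kind is what file names on some Apple file
       systems look like, which is the usual reason for wanting a
       decomposition switch at all.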



     Miscellaneous codeset related options
       --old-nec-compat
              Enable the old NEC kanji sequence (ESC-K,H).  Needs the compile
              option --enable-oldnec at configuration.

       --no-utf7
              Assume input codeset is *NOT* UTF-7 encoded Unicode.  This
              option disables input utf7 testing.

       --no-kana
              Assume input codeset does *NOT* include JIS X 0201 kana.

       --input-limit-to-jp
              Tell the detection mechanism that the input is some kind of
              Japanese codeset.


   OUTPUT Conversions options
       skf is intended to write its output stream to stdout, but an
       nkf-compatible option for changing file encodings is also provided.

       --overwrite[=SUFFIX] --in-place[=SUFFIX]
              converts the encoding of the file(s) specified as input.
              --overwrite preserves the file change date.  If the SUFFIX
              parameter is given, the input file is backed up under a name
              with SUFFIX appended.

       skf has various features for making output files suitable for the
       local environment.  Most of these are controlled by the extended
       control switches described in this section.

       --use-g0-ascii
              set G0 (=GL) for the output encoding to ASCII, ignoring the
              codeset designation.

649     X-0201 Kana/latin conversions
650       skf by default converts X-0201 kanas to X-0208 kanas. To output  X-0201
651       kana  as it is, use one of following options. When output is designated
652       to EUC or SJIS, these three options enable X-0201 kana output  by  ways
653       provided  by  each encoding. When Unicode output is specified, (equiv.)
654       kana part output is controlled by --use-compat, not following switches.
655       Valid only when output codeset is NOT Unicode family.
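
       The default X-0201 to X-0208 kana conversion described above can be
       illustrated outside skf. The following is only a sketch in Python,
       assuming the text is already decoded to Unicode, where X-0201 kana
       appear as half-width forms (U+FF61 to U+FF9F). It uses Unicode NFKC
       normalization as a stand-in and is not skf's own conversion table;
       the function name widen_kana is hypothetical.

```python
import re
import unicodedata

# Half-width katakana block (the Unicode image of JIS X 0201 kana).
HALFWIDTH_KANA = re.compile("[\uff61-\uff9f]+")


def widen_kana(text: str) -> str:
    """Replace runs of half-width kana with full-width kana.

    NFKC is applied only to the half-width runs, so other compatibility
    characters (for example full-width Latin letters) are left untouched.
    Voiced and semi-voiced marks are composed with the preceding kana.
    """
    return HALFWIDTH_KANA.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


if __name__ == "__main__":
    print(widen_kana("\uff8a\uff9f\uff9d"))  # half-width "pa" + "n"
```

       Normalizing whole runs rather than single characters is what lets a
       half-width kana and its following voiced mark merge into one
       full-width character.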
656
657       --kana-jis7
658              use SI/SO locking shift sequence to designate X-0201 kana.  This
659              switch is valid for jis, jis-x0213 and  cp50220  (i.e.  cp50221)
660              encoding.  For other codesets, this option is ignored.
661
662       --kana-jis8
663              output X-0201 kana using 8-bit code right plane.  This switch is
664              valid for jis and jis-x0213 encoding.  For other  codeset,  this
665              option is ignored.
666
667       --kana-esci --kana-call
668              use  ESC-(-I to designate X-0201 kana.  This switch is valid for
669              jis, jis-x0213 and cp50220 (i.e. cp50222) encoding.   For  other
670              codeset, this option is ignored.
671
672       --kana-enable
673              If  output  is  EUC-JP  or cp51932, use X-0201 kana with G2.  If
674              SJIS output, it is same as --kana-jis8.  When JIS output, it  is
675              same as --kana-call.
676
677       --use-iso8859-1
678              Enable iso-8859-1 output. Iso-8859-1 is invoked to G1 and set to
679              GR plane.
680
681
682     URI/TeX format conversion feature options
683       With Unicode(tm) family output codings, skf outputs non-ascii latin
684       characters as they are, but with other output codings, skf converts
685       these characters using the following rules:
686
687       (1) If a code is defined in a specified output codeset, specified  code
688       point is used for output.
689       (2) If one of the html convert modes is enabled (i.e. --convert-html,
690       --convert-sgml) and the code is defined in the html/sgml codeset,
691       it is converted to an entity reference or a codepoint reference.
692       (3)  If tex convert mode enabled and the code is defined in tex expres‐
693       sion, it is converted to tex format.
694       (4) If the code is a kind of combined ligatures, it is shown by  a  set
695       of characters.
696       (5) A kind of replacement character is shown, with warning.
697
698       --convert-html --convert-sgml --convert-xml
699              Enable html convert mode. This mode is cleared by --reset. These
700              options are synonyms, and are treated as the same option.
701
702       --convert-html-decimal
703              Enable html  code-point  decimal  convert  mode.  This  mode  is
704              cleared by --reset.
705
706       --convert-html-hexadecimal
707              Enable  html  code-point  hexadecimal convert mode. This mode is
708              cleared by --reset.
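
       The decimal and hexadecimal code-point forms can be pictured with a
       small Python sketch. It assumes that every non-ascii character is
       written as a numeric character reference; skf itself first applies
       the rules listed at the top of this section, so its real output may
       differ. The function name to_char_refs is hypothetical.

```python
def to_char_refs(text: str, hexadecimal: bool = False) -> str:
    """Write non-ASCII characters as HTML numeric character references."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)               # ASCII passes through unchanged
        elif hexadecimal:
            out.append(f"&#x{code:x};")  # hexadecimal form
        else:
            out.append(f"&#{code};")     # decimal form
    return "".join(out)


if __name__ == "__main__":
    print(to_char_refs("caf\u00e9"))
    print(to_char_refs("caf\u00e9", hexadecimal=True))
```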
709
710       --convert-tex
711              Enable TeX convert mode. This mode is cleared by --reset.
712
713       --convert-perl
714              Enable Perl5 literal convert  mode.  This  mode  is  cleared  by
715              --reset.
716
717       --convert-java
718              Enable  Java  literal  convert  mode.  This  mode  is cleared by
719              --reset.
720
721       --convert-python
722              Enable Python literal convert mode.  This  mode  is  cleared  by
723              --reset.
724
725       --use-replace-char
726              In Unicode, use the Unicode replacement character (U+fffc) for
727              undefined characters.
728
729
730 Extended Options
731   Encoding/Decoding control options
732       --decode=`encoding scheme'
733
734       --encode=`encoding scheme'
735              Specify a decoding/encoding scheme for the input stream. Supported
736              encoding schemes for decoding are 'hex', 'mime', 'mime_q',
737              'mime_b', 'uri', 'ace', 'hex_perc_encode'. Each option means CAP
738              hex-code, mime, mime Q-encoding, mime B-encoding, uri character
739              reference, ACE punycode, uri percent notation, base64, Q-encoding,
740              rfc2231 and rot13/47 respectively. 'none' means no decode.
741              For encoding, 'hex', 'mime_b', 'mime_q', 'uri', 'ace', 'cap',
742              'hex_perc_encode', 'base64' and 'none' are supported. Output with
743              encoding is not supported for EBCDIC related codesets and some
744              already ascii-encoded codesets (e.g. UTF-7).
745              Only one decode/encode option is valid; if more than one
746              option is specified, the last one is used. When one of the mime
747              decodings is specified, the base text is assumed to be EUC encoded
748              unless specified otherwise. Except for rot, which assumes the input
749              stream is Shift_JIS, EUC or iso-2022-jp, these encodings assume the
750              input stream is ascii (as defined in RFC2045). Some encodings
751              may co-exist with encoding, but this is not guaranteed.
752              In particular, if the input is UTF-16/UCS2 code, these encodings
753              are ignored in skf.
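
       As an illustration of what the mime B-encoding scheme does to a
       header word, here is a sketch using only the Python standard
       library. It is not skf code and says nothing about skf's internal
       handling; the helper names encode_word and decode_words are
       hypothetical, and iso-2022-jp is chosen only as an example charset.

```python
import base64
from email.header import decode_header


def encode_word(text: str, charset: str = "iso-2022-jp") -> str:
    """Build one MIME B-encoded word (RFC 2047 style) from text."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


def decode_words(header: str) -> str:
    """Decode MIME encoded words in a header string back to text."""
    parts = []
    for raw, charset in decode_header(header):
        if isinstance(raw, bytes):
            # Encoded words come back as bytes plus their charset name.
            parts.append(raw.decode(charset or "ascii"))
        else:
            # Headers without encoded words come back as plain str.
            parts.append(raw)
    return "".join(parts)


if __name__ == "__main__":
    word = encode_word("\u65e5\u672c\u8a9e")
    print(word)
    print(decode_words(word))
```

       The round trip shows why the decoder needs the charset label inside
       the encoded word: the base64 payload alone does not say which
       codeset the bytes are in.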
754
755       --mime-ms-compat
756              treat Japanese generic codesets as Microsoft cp932 compatible:
757              iso-2022-jp as cp50220, euc-jp as cp51932, Shift_JIS as cp932w.
758
759       --mime-persistent
760              skf detects address-like strings and excludes them from mime encoding.
761              This option disables such behavior. Default in nkf-compatible mode.
762
763
764   Shortcut
765       -m     same as --decode=mime
766
767       -mB    same as --decode=mime_b
768
769       -mQ    same as --decode=qencode
770
771       -m0    same as --decode=none
772
773       -M     same as --encode=mime_b
774
775       -MB    same as --encode=base64
776
777       -MQ    same as --encode=qencode
778
779   End of line control options
780       --lineend-thru
781              Output end-of-line code as it is. Also output ^Z code as it  is.
782              This is default.
783
784       --lineend-cr --lineend-mac -Lm
785              Use  CR  as  end-of-line  code.  Also  delete ^Z code from input
786              stream.
787
788       --lineend-lf --lineend-unix -Lu
789              Use LF as end-of-line code.  Also  delete  ^Z  code  from  input
790              stream.
791
792       --lineend-crlf --lineend-windows -Lw
793              Use  CR+LF  as  end-of-line code. Also delete ^Z code from input
794              stream.  This option doesn't preserve original order of  cr  and
795              lf.
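
       The effect of the three line-end options can be sketched in Python
       as follows. This is an illustration only, assuming that ^Z is the
       byte 0x1a and that any of CR+LF, LF+CR, CR or LF counts as one
       line end; the function name convert_lineend is hypothetical and
       skf's own handling of messy input may differ.

```python
import re

# Two-byte pairs are listed first so a pair becomes one line end, not two.
ANY_EOL = re.compile(rb"\r\n|\n\r|\r|\n")


def convert_lineend(data: bytes, eol: bytes) -> bytes:
    """Drop ^Z (0x1a) and rewrite every line end as `eol`."""
    data = data.replace(b"\x1a", b"")
    return ANY_EOL.sub(eol, data)


if __name__ == "__main__":
    mixed = b"a\r\nb\rc\nd\x1a"
    print(convert_lineend(mixed, b"\n"))
    print(convert_lineend(mixed, b"\r\n"))
```

       Because every recognized line end is replaced by the same sequence,
       the original order of cr and lf in the input is not kept.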
796
797       --input-cr
798              Assume input stream uses CR as end-of-line code.
799
800       --input-lf
801              Assume input stream uses LF as end-of-line code.
802
803       --input-crlf
804              Assume input stream uses CR+LF as end-of-line code.
805
806       -F[line_length[-kinsoku]]
807
808       -f[line_length[-kinsoku]] -f[line_length[+kinsoku]]
809              Wrap input lines at line_length columns. The f option deletes
810              CR/LF's in input, and the F option doesn't delete them. For the
811              Japanese convention, both gyoutou-kinsoku (by burasage-gumi) and
812              gyoumatsu-kinsoku (by oidasi-gumi) are supported. The burasage
813              length is controlled by the kinsoku option. The default value for
814              line_length is 66, and it must be < 1000. The default value for
815              kinsoku is 5, and it must be <= 10. With the 'f' option, skf
816              autodetects paragraphs and retains some CR/LF. The 2nd 'f' option
817              format (with '+') disables this behaviour. In nkf compatible mode,
818              some fold behaviors change as follows.
819              (1) Default line_length is set to 60, and kinsoku value is 10.
820              (2) alpha numeric characters become gyoutou-kinsoku characters.
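
       The idea of gyoutou-kinsoku by burasage-gumi (letting a character
       that must not start a line hang past the line limit) can be shown
       with a simplified Python sketch. It makes several assumptions that
       are not taken from skf: it counts characters rather than display
       columns, it uses a small hypothetical set of prohibited line-start
       characters, and it ignores gyoumatsu-kinsoku and paragraph
       detection.

```python
# Hypothetical set of characters that should not begin a line.
GYOUTOU_KINSOKU = set("\u3001\u3002\uff0c\uff0e\uff09\u300d\u300f")


def fold(text: str, line_length: int = 66, kinsoku: int = 5) -> list:
    """Fold text, letting up to `kinsoku` prohibited characters hang."""
    lines = []
    current = ""
    for ch in text:
        full = len(current) >= line_length
        # A prohibited line-start character may hang on the current line
        # as long as the hanging budget is not used up.
        may_hang = (ch in GYOUTOU_KINSOKU
                    and len(current) < line_length + kinsoku)
        if full and not may_hang:
            lines.append(current)
            current = ""
        current += ch
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in fold("\u3042\u3044\u3046\u3048\u3002\u304a", 4, 1):
        print(line)
```

       In the example the full stop stays on the first line, one character
       beyond the limit of 4, instead of starting the second line.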
821
822   File control options
823       --filewise-detect --force-reset
824              Reset and re-detect input code set at the start of each file.
825
826       --linewise-detect
827              Reset and re-detect input code set at the start of each line.
828
829
830   Compatibility options
831       --nkf-compat
832              Interpret the following options in an nkf-compatible manner. -l,
833              -d, -c, -x, -X, -w and -W work as in nkf2.x. The behavior of -f
834              and -F changes as shown above. -T, -i and -o are not supported.
835              Most of the other nkf options and switches also work like nkf,
836              except in case of error.
837
838       --skf-compat
839              Interpret the following options in the skf-native manner.
840
841       -r     nkf-compatible rot. Works only with --nkf-compat  mode.  Allowed
842              input encodings are limited to JIS/Shift_JIS/EUC.
843
844       -h[123] --hiragana --katakana --katakana-hiragana
845              -h, -h1 and --hiragana convert all kanas to hiragana. -h2 and
846              --katakana convert all kanas to katakana. -h3 and
847              --katakana-hiragana swap katakana and hiragana.
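
       The three kana conversions can be illustrated on Unicode text with
       a short Python sketch. It relies on the fact that the hiragana
       (U+3041 to U+3096) and katakana (U+30A1 to U+30F6) blocks are laid
       out in parallel, 0x60 code points apart. This is not skf's
       implementation, and the function names are hypothetical.

```python
HIRA_FIRST, HIRA_LAST = 0x3041, 0x3096
KATA_FIRST, KATA_LAST = 0x30A1, 0x30F6
OFFSET = KATA_FIRST - HIRA_FIRST  # 0x60


def _shift(ch: str, to_kata: bool, to_hira: bool) -> str:
    code = ord(ch)
    if to_kata and HIRA_FIRST <= code <= HIRA_LAST:
        return chr(code + OFFSET)
    if to_hira and KATA_FIRST <= code <= KATA_LAST:
        return chr(code - OFFSET)
    return ch  # everything else is left unchanged


def to_hiragana(text: str) -> str:
    return "".join(_shift(c, False, True) for c in text)


def to_katakana(text: str) -> str:
    return "".join(_shift(c, True, False) for c in text)


def swap_kana(text: str) -> str:
    return "".join(_shift(c, True, True) for c in text)


if __name__ == "__main__":
    print(to_katakana("\u3072\u3089"))
    print(to_hiragana("\u30ab\u30ca"))
    print(swap_kana("\u3072\u30ab"))
```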
848
849       --nkf-help
850              show option difference/compatibility between skf and nkf.
851
852       --in-place[=SUF] --overwrite[=SUF]
853              replace specified file with converted codeset. overwrite retains
854              file create time stamp.  If a suffix is  given,  the  suffix  is
855              added to output file name and input file is not removed.
856
857
858   Lightweight language specific options
859       The skf plugins for lightweight languages support a subset of options.
860       More specifically, file input/output related options (-b, -u, --overwrite,
861       --in-place, --filewise-detect, --linewise-detect, --show-filename,
862       --suppress-filename) and UTF-16 output are disabled (except for ruby or python3).
863
864
865     Ruby-1.9.x/2.x specific options
866       Since ruby 1.9, ruby uses  CCS  string  handling.  skf  returns  output
867       string  with  specified codeset. Following options override this behav‐
868       ior.
869
870       --rb-out-ascii8bit
871              returns string with ascii-8bit encoding.
872
873       --rb-out-string
874              returns string with specified encoding.
875
876     Python-3.x specific options
877       Since native codeset representation  in  python3.x  is  UCS2/UCS4,  skf
878       behaves  differently  with  output codeset option. If output codeset is
879       either UTF-16 or UTF-32(in wide mode), skf returns Unicode object,  and
880       for  all  other  codesets  skf  returns  binary array object. Following
881       options change this behavior.
882
883       --py-out-binary
884              use pseudo unicode binary stream for output.
885
886       --py-out-string
887              use binary array object on UTF-16/32 output. BOM is enabled.
888              skf accepts either a binary array or a unicode object for
889              input.
890
891
892   Misc. Control options
893       --disable-space-convert --enable-space-convert
894              skf  converts  an ideographic space into two ascii spaces.  Dis‐
895              able option disables, and enable option enables  this  behavior.
896              Default is disabled.
897
898       --html-sanitize
899              Convert  several characters in HTML document to entity reference
900              expression. Specifically, "!#$&%()/<>:;?´ are escaped by entity-
901              references.
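
       A rough Python picture of this kind of sanitizing is given below.
       It assumes that the last character in the list above is an
       apostrophe and it writes every listed character as a decimal
       character reference; skf may use named entities instead, so the
       exact output is an assumption. The function name sanitize is
       hypothetical.

```python
# Characters taken from the list above; the final one is assumed to be
# an apostrophe.
SANITIZE_CHARS = "\"!#$&%()/<>:;?'"

# Per-character table, so '&', '#' and ';' produced by one replacement
# are never escaped a second time.
TABLE = {ord(c): f"&#{ord(c)};" for c in SANITIZE_CHARS}


def sanitize(text: str) -> str:
    """Replace each listed character by a decimal character reference."""
    return text.translate(TABLE)


if __name__ == "__main__":
    print(sanitize("<a href=x>"))
```

       A single translation pass is used on purpose: chained string
       replacements would escape the ampersands and semicolons that earlier
       replacements had just inserted.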
902
903       --filewise-detect --force-reset
904              If multiple input files are given, detect input codeset for each
905              file.
906
907       --linewise-detect
908              Detect input code  line-wise.  Note  this  option  weakens  code
909              detect correctness.
910
911       --reset
912              Reset all flags specified by extended controls and environment
913              variables.
914
915       --inquiry --guess
916              skf detects the input code and writes the detection result to
917              stdout. No filtering output is performed. If multiple input files
918              are given, --show-filename is automatically enabled.
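
       To give a feeling for what codeset detection involves, here is a
       deliberately naive Python sketch based on trial decoding. It is not
       the algorithm skf uses, the candidate list and its order are
       assumptions, and the function name guess_codeset is hypothetical.
       Short or mixed input can easily be valid in more than one codeset,
       which is one reason why detection cannot be perfect.

```python
CANDIDATES = ("utf-8", "euc_jp", "shift_jis")


def guess_codeset(data: bytes) -> str:
    """Return the first candidate codeset that decodes `data` cleanly."""
    if all(b < 0x80 for b in data):
        # 7-bit data: an ESC byte hints at ISO-2022 style designations.
        return "iso2022_jp" if b"\x1b" in data else "ascii"
    for name in CANDIDATES:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return "unknown"


if __name__ == "__main__":
    sample = "\u65e5\u672c\u8a9e"
    for codec in ("utf-8", "euc_jp", "shift_jis", "iso2022_jp"):
        print(codec, "->", guess_codeset(sample.encode(codec)))
```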
919
920       --hard-inquiry
921              Similar to --inquiry, but reports both the code and the
922              end-of-line character.
923
924       --suppress-filename
925              When  inquiry(--inquiry)  is  on, this option disables file name
926              output.  This option overrides --show-filename.
927
928       --show-filename
929              When inquiry(--inquiry) is on, this option adds each  file  name
930              to output.
931
932       --invis-strip
933              Delete  all  escape  sequences  not  belonging  to ISO-2022 code
934              extension. This is intended to replace invisstrip  command  bun‐
935              dled in inews package.
936
937       -I     Warn if input has unassigned code points.
938
939       -v     print version information and exit.
940
941       --help print brief help and exit.
942
943       --show-supported-codeset
944              Display  supported  codesets  (input)  and  exit. Both canonical
945              names (left side) and detailed names are shown.  This  canonical
946              name  can  be  used  as  MIME charset and also as ic-option code
947              specification.
948
949       --show-supported-charset
950              Display supported character sets (output) and exit. Both canonical
951              names and detailed names are shown. Some charsets with special
952              treatments (i.e. meaningless as set-g* parameters) intentionally
953              lack addressable cnames.
954
955

FILES

957       /usr/(local/)share/skf/lib/   (Unices)
958
959       /Program Files/skf/share/lib (MS Windows)
960              These directories are where external codeset conversion tables
961              go. The location that the current skf assumes is shown by the -h
962              option.
963
964

AUTHOR

966       skf is written by Seiji Kaneko (efialtes@osdn.jp), based on an idea from
967       nkf written by Itaru Ichikawa (ichikawa@flab.fujitsu.co.jp). The X 0213
968       code table is derived from the work of earthian@tama.or.jp. Some codeset
969       mappings are derived from various sources. Detailed origins are shown in
970       the copyright document included in this distribution.
971
972

ACKNOWLEDGEMENT

974       skf is inspired by works or requests by shinoda@cs.titech,
975       kato@cs.titech, uematsu@cs.titech, void@global, ohta@ricoh, Hinata(HKE),
976       Ashizawa(CRL), Kunimoto(SDL), Oohara(Univ of Kyoto), Jokagi(elf2000) and
977       Naruse (at osdn.jp). Thanks.
978
979

BUGS AND LIMITATIONS

981       1. skf can handle mixed coding with  some  limitations.  However,  code
982       detection  tends to fail for mixed code, and giving explicit input code
983       set is strongly encouraged, if codeset is known beforehand.
984       In case of need, --linewise-detect option may help, but code  detecting
985       will more likely fail.
986
987       2. skf implements ISO-2022 with following exceptions.
988        i)  GL 0x20 is always space. Even when 96-character codeset is invoked
989       to GL.
990        ii) Sequences for setting codes to C1 and C2 are always ignored.
991        iii) If an unknown sequence is given to G0, G0 is set to ascii, and
992       locking/single shift is cleared. An unknown sequence call to set G1-G3
993       is just ignored.
994        Private charset is also not supported and is ignored.
995        iv) Sequences for 96 character multibyte coding is ignored (Currently,
996       no codeset is registered).
997        v) Calling UTF-8, UTF-16 coding system from iso-2022 is supported, and
998       returns to previous coding system by standard return.
999        Callings and returns to/from other coding schemes are ignored.
1000        vi) For supporting some of cellular phone glyphs, several private (not
1001       registered) codesets are defined in skf, and can be called by appropri‐
1002       ate sequences.
1003
1004       3. Error output coding is controlled by LOCALE environment variables in
1005       UN*X system. skf doesn't take care of situations like stdout and stderr
1006       are redirecting into a same stream. Such case should be handled by user
1007       side.
1008
1009       4. skf converts KEIS/JIS X 0213 code using CJK-extension B area and CJK
1010       compatibility area. For this reason, X 0213  and  KEIS  convert  result
1011       varies depending on --use-compat and --limit-to-ucs2 switches.
1012
1013       5.  JIS X 0207:1979 is not supported. JIS X 0211:1987 is designed to be
1014       supported (i.e. common terminal control sequence will be  transparently
1015       passed to output).
1016
1017       6.  Even  if  unbuffer  option(-u)  is specified, some code-translation
1018       related bufferings are still performed (in MIME, kana, VIQR etc.).
1019
1020       7. skf-1.9x or later recognizes and handles languages in iso639-1(alpha
1021       2).  iso639-2 is not supported as a valid language set.
1022
1023       8. Unicode IVS is not supported. Sequences are just discarded.
1024
1025       9.  skf-1.9x  or  later does not retain Macintosh RLO-ordered character
1026       property.  Codesets with this kind of codes are not supported.
1027
1028

Notes

1030       1. Extended options are changed extensively since skf-1.9. Some archaic
1031       options (eg. -B, -@ and -r) have been deleted from this version.
1032
1033       2. skf was originally a fork of nkf, but doesn't contain any
1034       nkf code now. The copyright notice is retained out of respect.
1035
1036       3. From version 1.9, default Japanese character set assumed by skf  has
1037       changed  to JIS X 0208:1990 with Microsoft Japanese Windows gaiji (i.e.
1038       CP932).
1039
1040       4. Code autodetection is not perfect by design. If  it  has  failed  to
1041       detect  input code properly, please give input code information explic‐
1042       itly.
1043
1044       5. Some ligatures in Unicode, cp932  gaiji  and  KEIS83  are  converted
1045       using  JIS  X  0124  and other convention.  During this conversion, its
1046       byte length is not preserved.
1047
1048       6. skf is intended to  pass  ANSI  compatible  terminal  control  codes
1049       transparently, but this is not guaranteed.
1050
1051       7. nkf's -i and -o options work only in nkf-compat mode. They are
1052       obsolete options in 1.97, and valid only for iso-2022-jp, without
1053       considering output codeset specifications.
1054
1055       8.  For unconverted character, skf uses geta and undefined character as
1056       --use-replace-char option.  If  output  codeset  doesn't  contain  geta
1057       code, skf prefers 'black square character', then uses '.' respectively.
1058
1059       9. There are some undocumented options. These options should be consid‐
1060       ered as highly experimental.
1061
1062       10. In lineend_thru mode with folding, skf remembers the order in which
1063       cr and lf appear in the stream, and uses that order. Because of this
1064       design, if skf needs to output a line-end character before any line-end
1065       character appears in the input stream, the input order may not be preserved.
1066
1067       11. NKF-compatibility
1068       1) --prefix, some --fb's and --no-best-fit-chars are not supported.
1069       2) MSDOS (and -T), --exec-in and --exec-out are not supported.
1070       3) MIME decoding/encoding handling behaviors differ in various ways.
1071       4)  lineend  conversion  acts  differently. Results may not be same for
1072       some messy text.
1073       5) -r option and --decode=rot is different. See  each  option  descrip‐
1074       tion.
1075       6)  detected codeset name is not compatible with nkf. --help and --ver‐
1076       sion return different results.
1077       7) in-place and overwrite suffix with * is not supported.
1078
1079       12. Conversion to NYUUKAN GAIJI is as follows
1080       1)   Kanji   codes   in   JIS   X0208(1997),   JIS   X0212(1990),   JIS
1081       X0213(2004/2012),
1082        Houmusho-kokuji No.582 beppyou No.1 are sent to output as it is.
1083       2)  Kanji codes in beppyou No.4-2 leftmost columns are converted to the
1084       first
1085        priority character in the table. If  the  second  priority  characters
1086       appear,
1087        the codes are sent to output as it is.
1088       3)  Other  kanji codes are converted as undefined codes. See above con‐
1089       version method.  Non-kanji codes (latins, glyphs etc.) are sent to out‐
1090       put as it is.
1091
1092       13. ARIB B24 compatibility
1093       1) Input only. ARIB B24 output is not supported.
1094       2) Neither international encoding nor X0213 extension are supported.
1095       3)  Macro  define  sequences are suppressed. These sequences are recog‐
1096       nized and
1097        discarded.
1098       4) Without specifying arib codeset, skf treats Arib-defined codepage as
1099       follows.
1100         i) private codepage are supported. ascii/jis x-0201 0x5f is not modi‐
1101       fied.
1102         ii) macro define/invoke and rpc invoke does not work.  These  charac‐
1103       ters are
1104           discarded.
1105
1106

Notice

1108       Unicode(TM)  is  a trademark of Unicode, Inc. Microsoft and Windows are
1109       registered trademarks of Microsoft corporation. Macintosh is  a  regis‐
1110       tered  trademark of Apple Inc. Vodafone is a trademark of Vodafone K.K.
1111       Other names and terms may be trademarks or registered trademarks of
1112       their respective owners. The trademark symbol (TM) may be omitted in
1113       this manual page.
1114
1115
1116
1117
1118                                  10/Aug/2018                           SKF(1)