KAKASI(1)                   General Commands Manual                  KAKASI(1)

NAME

       KAKASI - Kanji kana simple inverter (between Kanji, both Kana and
       Romaji)

SYNOPSIS

       kakasi [options] [jisyo1 [jisyo2 [jisyo3 ...]]]

DESCRIPTION

       Japanese sentences are often written in a mixture of Chinese characters
       (Kanji), Kana (Hiragana and Katakana), and Romaji (Latin phonetic
       transcription).  KAKASI converts between these four different ways of
       writing Japanese.

       This program is useful for those whose terminal or desktop does not
       support the native display of Japanese.  It is also a helpful tool for
       those who are learning Japanese (international students, children,
       etc.).

       Text is read from standard input (stdin), converted, and written to
       standard output (stdout).  In the following example the "bunchu" Kanji
       are converted into Hiragana.

                 kakasi -JH < document

       Since version 2.3.0, output with spaces between words has been
       supported.  In the following example the output has a space between
       each word.

                 kakasi -w < document

       Since version 2.3.5, level conversion mode has been supported.  In the
       following example, simple Kanji are left unconverted and difficult
       Kanji are converted into Hiragana.

                 kakasi -l4 < document

       KAKASI can also convert Japanese characters to alphabetic characters.
       In addition, Katakana in the JIS x0201 character set and Hiragana in
       the JIS x0208 character set can be converted to each other.

       KAKASI handles the following character sets.  The letter in parentheses
       is the mnemonic used to designate the set in options.

       ASCII (a) Known as the "ascii" character set.

       JISROMAN (j)
                 Known as the "jis roman" character set.

       GRAPHIC (g)
                 The DEC graphic character set.

       Katakana (k)
                 JIS x0201 Katakana, defined as part of the GR character set.

                 As a matter of convenience, JIS x0208 is divided as stated
                 below.

       Kanji (J)
                 JIS x0208 characters in sections 16 through 94.

       Hiragana (H)
                 JIS x0208 characters in section 4 (Hiragana).

       Katakana (K)
                 JIS x0208 characters in section 5 (Katakana).

       Sign (E)
                 JIS x0208 characters in sections 1, 2, 3, 6, 7, and 8.
                 (Note that sections 9-15 are undefined in JIS x0208.)

       Conversion between the following character sets is available.

       ASCII        -> JISROMAN, Sign

       JISROMAN     -> ASCII, Sign

       GRAPHIC      -> ASCII, JISROMAN, Sign

       JISx0201 Katakana
                    -> ASCII, JISROMAN, Katakana, Hiragana

       Sign         -> ASCII, JISROMAN

       Katakana     -> ASCII, JISROMAN, JISx0201 Katakana, Hiragana

       Hiragana     -> ASCII, JISROMAN, JISx0201 Katakana, Katakana

       Kanji        -> ASCII, JISROMAN, JISx0201 Katakana, Katakana, Hiragana

       When the target is ASCII or JISROMAN, characters from JISx0201
       Katakana, Katakana, Hiragana, and Kanji are converted to alphabetic
       characters (Romaji).

       Examples:

           1. All Kanji characters are converted to Hiragana.

               kakasi -JH

           2. All JIS x0208 characters are converted to JIS x0201.

               kakasi -Hk -Kk -Jk -Ea

           3. All characters are converted to JIS x0208.

               kakasi -aE -jE -gE -kK

           4. All characters are converted to ascii and words are separated.

               kakasi -Ha -Ka -Ja -Ea -ka

           5. Exchange between Katakana and Hiragana characters.

               kakasi -HK -KH

CONVERSION DESIGNATED CHARACTER SET

       Character sets are categorized by kakasi and designated by the
       following mnemonics: a, j, g, k, E, H, K, J.

             a --- ASCII characters
             j --- JIS ROMAN (nearly equal to ASCII; only "~" and "\"
                   differ), defined by JIS x0201
             g --- DEC Graphic Characters
             k --- KATAKANA defined by JIS x0201

       E, H, K, and J are subsets of the JIS x0208 character set.

             J --- KANJI characters of JIS x0208.
             H --- HIRAGANA characters of JIS x0208.
             K --- KATAKANA characters of JIS x0208.
             E --- The remaining characters of JIS x0208, which include
                   alphabets, numbers, symbols and so on.

       -(from)(to) means conversion from character set (from) to character
       set (to).  For example, the -JH option converts KANJI characters to
       HIRAGANA.  The combinations in the following table are available.
       (You need not memorize it, because the -h option shows the same
       information.)

             to\from|    a    j    k    E    H     K    J    g
             -------+--------------------------------------------
                a   |    -    o    o1   o    o1    o1   o12  o
                j   |    o    -    o1   o    o1    o1   o12  o
                k   |              -         o     o    o2
                E   |    o    o         -                    o
                H   |              o         -     o    o2
                K   |              o         o     -    o2

             o  -- converted.
             1  -- converted to Romaji.
             2  -- Kanji -> Kana conversion.
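
       The table above can also be expressed as a small lookup.  The following
       is a minimal sketch and not part of KAKASI: the function name
       kakasi_pair_supported is hypothetical, and the J -> K entry is taken
       from the list in DESCRIPTION (Kanji -> Katakana).

```shell
# Hypothetical helper (not part of KAKASI): succeeds when the conversion
# from character set $1 to character set $2 is listed as available.
kakasi_pair_supported() {
    case "$1$2" in
        # from ASCII, JIS ROMAN, and DEC graphic
        aj|aE|ja|jE|ga|gj|gE) return 0 ;;
        # from JIS x0201 Katakana
        ka|kj|kH|kK) return 0 ;;
        # from Sign
        Ea|Ej) return 0 ;;
        # from Hiragana and Katakana (JIS x0208)
        Ha|Hj|Hk|HK|Ka|Kj|Kk|KH) return 0 ;;
        # from Kanji (J -> K assumed from the DESCRIPTION list)
        Ja|Jj|Jk|JH|JK) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: check a pair before building an option such as -JH.
if kakasi_pair_supported J H; then
    echo "-JH is an available conversion"
fi
```

       A script might call such a check before passing a user-supplied
       -(from)(to) pair to kakasi.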

KANJI CODING CONVERSION

       Unfortunately, several coding systems are used in Japan, and the JIS
       x0208 standard was revised in 1983.  KAKASI therefore detects the
       coding system and its revision automatically and uses the same coding
       system for output, provided the document does not include JIS x0201
       KATAKANA.  If JIS x0201 KATAKANA is included, or if you wish to change
       the kanji coding system, use the following options.

             -i : input coding
             -o : output coding

             jis -- Widely used on the internet. (Ex: fj, jp, .. newsgroups)
                    Based on the ISO-2022 coding scheme.
                    newjis: JISx0208 (1983) invoked by ESC-$-B.
                    oldjis: JISx0208 (1978) invoked by ESC-$-@.
             euc,dec -- Often used on UNIX-like computers. JISx0208 is
                    assigned to GR ( MSB is 1 ). The major difference between
                    euc and dec is the assignment of JISx0201 KATAKANA and
                    the DEC graphic character set.
             sjis -- Defined by Microsoft Corp. Widely used on personal
                    computers ( MSDOS, Mac, .. )
             utf8 -- Current international standard.  Most modern OSs use
                    this encoding of the Unicode character set by default.

ROMAJI CONVERSION

       Options for conversion to Romaji, used with the -J? options.  There
       are two systems of Romaji writing.  The first is the Kunrei method
       defined by the Japanese government, and the second is the Hepburn
       method.  The author considers the Hepburn method more natural for
       non-Japanese readers.

             -rhepburn : Hepburn Method (default)
             -rkunrei  : Kunrei Method

OTHER OPTIONS

             -p: List all possible readings. If there are two or more
                 possible readings, KAKASI shows them in braces {aaa,bbb}.
             -s: Insert a separator character between words.
             -f: Furigana mode. Shows the original kanji word with its
                 reading.
             -c: Skip characters within a word. ( default TAB CR LF BLANK )
             -C: Capitalize Romaji words (with the -Ja or -Jj option)
             -U: Convert Romaji words to upper case (with the -Ja or -Jj
                 option)
             -u: Call fflush().
             -w: wakatigaki mode. 'wakatigaki' is word segmentation for
                 Japanese sentences.

DICTIONARIES

       KAKASI accepts additional dictionaries besides the system dictionary.
       Accepted formats include the SKK format, the Wnn format, and so on.
       In each case a record is one line with two fields, Yomi (reading) and
       Jukugo (idiom).  Fields are separated by commas, TABs, or blanks.  The
       kanji code of a dictionary is restricted to JIS or EUC.  See the
       separate document named JISYO for more details.
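
       As an illustration of the record format described above, the following
       sketch writes a one-entry dictionary and converts it to EUC.  The file
       names and the entry are assumptions chosen for illustration, and
       iconv(1) is assumed to be available; JISYO remains the authoritative
       description of the format.

```shell
# Sketch only: the entry and the file names are illustrative.
# Each record is one line: Yomi (reading), a blank, then Jukugo.
printf '%s %s\n' 'かかし' '案山子' > mydict.utf8

# The manual restricts dictionaries to JIS or EUC, so convert the file
# (assumed here to have been written in UTF-8) to EUC-JP.
iconv -f UTF-8 -t EUC-JP mydict.utf8 > mydict.euc

# The dictionary can then be named after the options, as in the SYNOPSIS:
#   kakasi -JH mydict.euc < document
```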

ENVIRONMENT VARIABLES

       The behavior is affected by the following environment variables.

       KANWADICTPATH
              Specifies the full path of kanwadict, including the file name.
              The default value is $prefix/share/kakasi/kanwadict.

       ITAIJIDICTPATH
              Specifies the full path of itaijidict, including the file name.
              The default value is $prefix/share/kakasi/itaijidict.

AUTHOR

       Hironobu Takahasi <takahasi@tiny.or.jp>

FILES

       $prefix/share/kakasi/kanwadict
              The binary dictionary of KAKASI.  It is generated automatically
              from kakasidict by mkkanwa when the package is installed.

SEE ALSO

       mkkanwa(1)

DIAGNOSTICS

       Returns a non-zero exit status if any error occurs.
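
       A script can act on the exit status described above.  The following is
       a minimal sketch: the wrapper function to_hiragana is hypothetical and
       not part of KAKASI, and it uses only the -JH option shown in
       DESCRIPTION.

```shell
# Hypothetical wrapper (not part of KAKASI): convert the Kanji in file $1
# to Hiragana and report a failure based on the exit status.
to_hiragana() {
    if kakasi -JH < "$1"; then
        return 0
    else
        # In the else branch, $? still holds the status of the failed command.
        status=$?
        echo "kakasi failed with status $status" >&2
        return "$status"
    fi
}

# Usage (assuming a file named document exists):
#   to_hiragana document > document.hira || echo "conversion failed" >&2
```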

BUGS

       Report bugs to KAKASI Project <kakasi-dev@namazu.org>.  Please DO NOT
       contact the original author (Takahasi-san).

NOTE ABOUT ENGLISH MANUAL

       The content of the English manual is not exactly the same as that of
       the Japanese manual.



4.3 Berkeley Distribution            LOCAL                           KAKASI(1)