UPMENDEX(1)                General Commands Manual                UPMENDEX(1)

NAME
       upmendex - Multilingual index processor

SYNOPSIS
       upmendex [-ilqrcgf] [-s sty] [-d dic] [-o ind] [-t log] [-p no] [--]
                [idx0 idx1 idx2 ...]
       upmendex --help

DESCRIPTION
       upmendex is a general-purpose, multilingual, hierarchical index
       generator that works with upLaTeX, XeLaTeX and LuaLaTeX. It accepts
       one or more input files (.idx, usually produced by a text formatter
       of the LaTeX family), sorts the entries, and produces an output file
       that can then be typeset. It supports the Latin (including
       non-English), Greek, Cyrillic, Korean Hangul and Han (Hanzi
       ideograph) scripts, as well as Japanese Kana. It is almost
       compatible with makeindex and mendex, and adds a feature for
       handling the readings of kanji words.

       The formats of the input and output files are specified in a style
       file. The readings of kanji words can be specified in a dictionary
       file.

       The index can have up to three levels (0, 1, and 2) of subitem
       nesting.

OPTIONS
       -i      Take input from stdin, even when index files are specified.

       -l      Sort by character order. By default, entries are sorted by
               word order. Details are described below.

       -q      Quiet mode; send no messages to stderr except error messages
               and warnings.

       -r      Disable implicit page range formation. By default, three or
               more successive pages are automatically abbreviated as a
               range (e.g. 1–5).

       -c      Compress each sequence of intermediate blanks (spaces and/or
               tabs) into a single space and ignore leading and trailing
               blanks. By default, blanks in the index key are retained.

       -g      Use only the A-line (A, Ka, Sa, ...) of the gojuon table
               (Japanese syllabary) for Japanese index headings. By
               default, all the characters in the gojuon table are used.

       -f      Force output of characters even if their scripts are not
               supported by upmendex.

       -s sty  Use sty as the style file.

       -d dic  Use dic as the dictionary file. The dictionary file consists
               of a list of <index_word reading> pairs.

       -o ind  Use ind as the output index file. By default, the file name
               is formed by appending the extension ind to the base name of
               the first input file.

       -t log  Use log as the transcript file. By default, the file name is
               formed by appending the extension ilg to the base name of
               the first input file.

       -p no   Set the starting page number of the output index to no. The
               argument no may be a number or one of the following: any
               (the page following the end of the contents), odd (the next
               odd page after the end of the contents), even (the next even
               page after the end of the contents).

       --help  Show a summary of options.

       --      Arguments after -- are not taken as options.

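       As an illustration of the options above, the following Python
       sketch assembles an upmendex argument list and derives the default
       output names. The helper names build_command and default_outputs
       and the file names book.idx and book.ist are hypothetical; only the
       options themselves come from this page. How upmendex treats
       directory parts when forming the default names is not stated here,
       so the sketch simply swaps the extension.

```python
from pathlib import Path


def default_outputs(first_idx):
    """Return the default (.ind, .ilg) names for the first input file.

    Assumption: the extension of the given path is simply replaced.
    """
    base = Path(first_idx)
    return str(base.with_suffix(".ind")), str(base.with_suffix(".ilg"))


def build_command(idx_files, style=None, dictionary=None, output=None,
                  log=None, start_page=None, flags=""):
    """Assemble an argument list in the order shown in the SYNOPSIS."""
    cmd = ["upmendex"]
    if flags:
        # Single-letter switches such as "rc" are passed as one group.
        cmd.append("-" + flags)
    for opt, value in (("-s", style), ("-d", dictionary), ("-o", output),
                       ("-t", log), ("-p", start_page)):
        if value is not None:
            cmd += [opt, str(value)]
    cmd.append("--")  # everything after this is treated as an input file
    cmd += list(idx_files)
    return cmd
```

       The resulting list can be handed to subprocess.run() on a system
       where upmendex is installed.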

STYLE FILE
       The style file tells upmendex the format of the .idx input files and
       the intended format of the final output file. The format is upward
       compatible with that of makeindex and mendex. The style file
       contains a list of <specifier attribute> pairs. There are two types
       of specifiers: input and output. The pairs may appear in any order.
       A line beginning with '%' is a comment.

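       To make the <specifier attribute> layout concrete, here is a
       minimal Python sketch that reads such pairs. It is not the parser
       used by upmendex. It assumes one pair per line, tolerates leading
       blanks before '%', and keeps each attribute as raw text (quotes
       included) without interpreting escape sequences. The name
       parse_style is hypothetical.

```python
import re

# An attribute is a double-quoted string, a single-quoted character,
# or an integer. Backslash escapes are accepted but not decoded.
ATTRIBUTE = r'"(?:[^"\\]|\\.)*"' + r"|'(?:[^'\\]|\\.)'" + r"|-?\d+"
PAIR = re.compile(r"^\s*(\w+)\s+(" + ATTRIBUTE + r")\s*$")


def parse_style(text):
    """Return {specifier: raw attribute text} for a style file body."""
    params = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("%"):
            continue  # blank line or comment
        match = PAIR.match(line)
        if match is None:
            raise ValueError(f"line {number}: cannot parse {line!r}")
        params[match.group(1)] = match.group(2)
    return params
```

       Because pairs may appear in any order, a dictionary keyed by
       specifier is a natural container; a later pair for the same
       specifier overwrites an earlier one in this sketch.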

   Input file style parameters

       keyword <string> "\\indexentry"
              Command that takes an index entry as its argument.

       arg_open <char> '{'
              Opening delimiter that marks the beginning of an index entry.

       arg_close <char> '}'
              Closing delimiter that marks the end of an index entry.

       range_open <char> '('
              Opening delimiter that marks the beginning of a page range.

       range_close <char> ')'
              Closing delimiter that marks the end of a page range.

       level <char> '!'
              Delimiter that introduces a lower level.

       actual <char> '@'
              Symbol indicating that the following sequence is to appear
              as the index string in the output file.

       encap <char> '|'
              Symbol indicating that the following sequence is to be used
              as the command name attached to the page number.

       page_compositor <string> "-"
              Separator between page levels for a style with multi-level
              page numbers.

       page_precedence <string> "rnaRA"
              Order of precedence of page number formats. 'R' and 'r'
              denote upper- and lowercase Roman numerals, 'n' denotes
              Arabic numerals, and 'A' and 'a' denote upper- and lowercase
              Latin letters.

       quote <char> '"'
              Escape character for upmendex parameters.

       escape <char> '\\'
              Escape character for general scripts.

   Output file style parameters

       preamble <string> "\\begin{theindex}\n"
              Preamble of the output file.

       postamble <string> "\n\n\\end{theindex}\n"
              Postamble of the output file.

       setpage_prefix <string> "\n \\setcounter{page}{"
              Prefix of the page number when a starting page is specified.

       setpage_suffix <string> "}\n"
              Suffix of the page number when a starting page is specified.

       group_skip <string> "\n\n \\indexspace\n"
              String that inserts vertical space before a new section of
              the index.

       lethead_prefix <string> ""
              Prefix of the heading for a newly appearing heading letter.

       heading_prefix <string> ""
              Same as lethead_prefix.

       lethead_suffix <string> ""
              Suffix of the heading for a newly appearing heading letter.

       heading_suffix <string> ""
              Same as lethead_suffix.

       lethead_flag <number> 0
              Flag controlling the output of heading letters. '0', '1',
              '-1' and '2' denote no output, uppercase, lowercase and
              titlecase, respectively.

       heading_flag <number> 0
              Same as lethead_flag.

       tumunja <string> "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
              Heading characters for Hangul, given as a string. (Extended
              by upmendex)

       hanzi_head <string> ""
              Heading strings for hanzi (Kanji, Hanja), given as a single
              string of items joined with the separator ';'. (Extended by
              upmendex)

       item_0 <string> "\n \\item "
              Command sequence inserted between primary level entries.

       item_1 <string> "\n \\subitem "
              Command sequence inserted between sub level entries.

       item_2 <string> "\n \\subsubitem "
              Command sequence inserted between subsub level entries.

       item_01 <string> "\n \\subitem "
              Command sequence inserted between a primary level entry and
              a sub level entry.

       item_x1 <string> "\n \\subitem "
              Command sequence inserted between a primary level entry and
              a sub level entry when the primary entry has no page number.

       item_12 <string> "\n \\subsubitem "
              Command sequence inserted between a sub level entry and a
              subsub level entry.

       item_x2 <string> "\n \\subsubitem "
              Command sequence inserted between a sub level entry and a
              subsub level entry when the sub level entry has no page
              number.

       delim_0 <string> ", "
              Delimiter string between a primary level entry and its first
              page number.

       delim_1 <string> ", "
              Delimiter string between a sub level entry and its first page
              number.

       delim_2 <string> ", "
              Delimiter string between a subsub level entry and its first
              page number.

       delim_n <string> ", "
              Delimiter string between page numbers, used at every entry
              level.

       delim_r <string> "--"
              Delimiter string between the two pages of a page range.

       delim_t <string> ""
              Delimiter string output at the end of a page number list.

       suffix_2p <string> ""
              String inserted in place of delim_n and the following page
              number when two pages are contiguous. It takes effect only
              when the parameter is defined.

       suffix_3p <string> ""
              String inserted in place of delim_r and the third page number
              when three pages are contiguous. This parameter takes
              precedence over suffix_mp. It takes effect only when the
              parameter is defined.

       suffix_mp <string> ""
              String inserted in place of delim_r and the last page number
              when three or more pages are contiguous. It takes effect only
              when the parameter is defined.

       encap_prefix <string> "\\"
              Prefix of the encapsulating command when an encapsulating
              command is attached to the page number.

       encap_infix <string> "{"
              String placed just before the page number when an
              encapsulating command is attached to the page number.

       encap_suffix <string> "}"
              Suffix placed after the page number when an encapsulating
              command is attached to the page number.

       line_max <number> 72
              Maximum length of one output line. Longer lines are folded.

       indent_space <string> ""
              String inserted as indentation at the start of a folded line.

       indent_length <number> 16
              Length of the indentation inserted at the start of a folded
              line.

       symhead_positive <string> "Symbols"
              String output as the heading for numbers and symbols when
              lethead_flag or heading_flag is a positive number.

       symhead_negative <string> "symbols"
              String output as the heading for numbers and symbols when
              lethead_flag or heading_flag is a negative number.

       symbol <string> ""
              String output as the heading for numbers and symbols when
              symbol_flag is nonzero. If specified, this option takes
              precedence over symhead_positive and symhead_negative.
              (Extended by (up)mendex)

       symbol_flag <number> 1
              Flag controlling the output of the symbol heading. If '0',
              it is not output. (Extended by (up)mendex)

       letter_head <number> 1
              Flag selecting the heading letters for Japanese Kana. '1'
              selects Katakana and '2' selects Hiragana. (Extended by
              (up)mendex)

       priority <number> 0
              Flag selecting the sorting method for index words composed of
              both Japanese and non-Japanese (e.g. Latin) scripts. If
              nonzero, one space (U+0020) is inserted between a Japanese
              sequence and a non-Japanese sequence during sorting.
              (Extended by (up)mendex)

       character_order <string> "SNLGCJKH"
              Order of scripts and symbols. 'S', 'N', 'L', 'G', 'C', 'J',
              'K' and 'H' denote symbol, number, Latin, Greek, Cyrillic,
              Japanese Kana, Korean Hangul and Hanja, respectively.
              (Extended by upmendex)

       icu_locale <string> ""
              Locale for the ICU collator. By default, the "root sort
              order" is used. (Extended by upmendex)

       icu_rules <string> ""
              Customized collation rules for the ICU collator. Unicode
              characters in UTF-8 encoding and the following escape
              sequences are accepted: \Uhhhhhhhh (8-digit hexadecimal
              [0-9A-Fa-f]), \uhhhh (4-digit hexadecimal), \xhh (2-digit
              hexadecimal), \x{h...} (1..8-digit hexadecimal), and \ooo
              (3-digit octal [0-7]). By default, the locale is used.
              (Extended by upmendex)
              Ref. <http://userguide.icu-project.org/collation/customization>,
              <http://www.unicode.org/reports/tr35/tr35-collation.html#Rules>

       icu_attributes <string> ""
              Attributes for the ICU collator. The following are
              available: "alternate:shifted", "alternate:non-ignorable",
              "strength:primary", "strength:secondary",
              "strength:tertiary", "strength:quaternary",
              "strength:identical", "french-collation:on",
              "french-collation:off", "case-first:off",
              "case-first:upper-first", "case-first:lower-first",
              "case-level:on", "case-level:off", "normalization-mode:on",
              "normalization-mode:off". (Extended by upmendex)
              Ref. <http://userguide.icu-project.org/collation/customization>,
              <http://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options>

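       The interplay of implicit page range formation (see -r) with
       delim_n and delim_r can be sketched in Python as follows. This is
       an illustration under simplifying assumptions, not the upmendex
       implementation: it handles plain Arabic page numbers only and
       ignores suffix_2p, suffix_3p, suffix_mp and encapsulating commands.
       The function name format_pages is hypothetical.

```python
def format_pages(pages, delim_n=", ", delim_r="--", make_ranges=True):
    """Join page numbers, abbreviating runs of three or more pages.

    make_ranges=False mimics the effect described for the -r option.
    """
    pages = sorted(set(pages))
    parts = []
    i = 0
    while i < len(pages):
        # Extend j to the end of the run of successive pages starting at i.
        j = i
        while j + 1 < len(pages) and pages[j + 1] == pages[j] + 1:
            j += 1
        if make_ranges and j - i >= 2:
            # Three or more successive pages become "first--last".
            parts.append(f"{pages[i]}{delim_r}{pages[j]}")
        else:
            parts.extend(str(p) for p in pages[i:j + 1])
        i = j + 1
    return delim_n.join(parts)
```

       With the default delimiters, pages 1 to 5, 7, 8 and 10 come out as
       "1--5, 7, 8, 10"; two successive pages are not merged because the
       default requires three or more.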

JAPANESE PROCESSING
       Compared to makeindex, upmendex has an additional feature that
       simplifies the handling of Japanese indexes: users are spared the
       effort of manually specifying a reading for every kanji word.

       Japanese kanji words are usually sorted by the syllables of their
       readings ('Yomi'), which can be written in kana (Hiragana or
       Katakana). upmendex accepts index words given directly in kana in
       the input file, and can also convert index words to kana by looking
       them up in Japanese dictionaries.

       Internally, the syllables of a reading are simplified. Examples are
       shown below.

              かぶしきがいしゃ    かふしきかいしや
              マッキントッシュ    まつきんとつしゆ
              ワープロ            わあふろ

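       The three examples above are consistent with the following
       transformations: Katakana becomes Hiragana, voiced and semi-voiced
       marks are dropped, small kana become full-size kana, and the long
       vowel mark becomes the vowel of the preceding kana. The Python
       sketch below implements exactly these four steps. They are inferred
       from the examples only, so the actual rules inside upmendex may
       differ; the name simplify_reading is hypothetical.

```python
import unicodedata

SMALL_TO_LARGE = str.maketrans("ぁぃぅぇぉっゃゅょゎ", "あいうえおつやゆよわ")
VOWEL_ROWS = {
    "あ": "あかさたなはまやらわ",
    "い": "いきしちにひみり",
    "う": "うくすつぬふむゆる",
    "え": "えけせてねへめれ",
    "お": "おこそとのほもよろを",
}
VOWEL_OF = {kana: vowel for vowel, row in VOWEL_ROWS.items() for kana in row}
LONG_VOWEL_MARK = "\u30fc"


def simplify_reading(reading):
    """Reduce a kana reading to plain full-size Hiragana."""
    # 1. Katakana (U+30A1..U+30F6) -> Hiragana (0x60 lower).
    hira = "".join(
        chr(ord(c) - 0x60) if 0x30A1 <= ord(c) <= 0x30F6 else c
        for c in reading)
    # 2. Decompose, then drop the combining (semi-)voiced sound marks.
    decomposed = unicodedata.normalize("NFD", hira)
    plain = "".join(c for c in decomposed if c not in "\u3099\u309a")
    # 3. Small kana -> full-size kana.
    plain = plain.translate(SMALL_TO_LARGE)
    # 4. Long vowel mark -> vowel of the preceding kana, when known.
    out = []
    for c in plain:
        if c == LONG_VOWEL_MARK and out and out[-1] in VOWEL_OF:
            out.append(VOWEL_OF[out[-1]])
        else:
            out.append(c)
    return "".join(out)
```

       A long vowel mark with no preceding kana of a known vowel row is
       left unchanged here; that choice is an assumption of the sketch.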

       The dictionary file consists of a list of <index_word reading>
       pairs. The index word can be written in any script (kanji, kana,
       etc.), and the reading must be in Hiragana or Katakana. The
       delimiter between the index word and its reading is one or more tabs
       or spaces. An example of a Japanese dictionary is shown below.

              漢字    かんじ
              読み    よみ
              環境    かんきょう
              α       アルファ

       Each index word may have only one Yomi in the dictionary. Although
       some kanji words (e.g. 「表」) have more than one Yomi (e.g.
       「ひょう」 and 「おもて」), only one of them can be registered. When
       different Yomi are needed, specify them explicitly in kana in the
       input file (e.g. \index{ひょう@表} or \index{おもて@表}).

       In addition, a dictionary file is read automatically when its file
       name is set in the environment variable INDEXDEFAULTDICTIONARY. The
       dictionary set by the environment variable can be used together with
       files specified by the -d option.

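       The dictionary layout described above is simple enough to read with
       a few lines of Python. The sketch below is illustrative only: it
       splits each line on tabs and spaces, skips lines that do not have
       exactly two fields, and keeps the first reading when a word is
       repeated. The last two behaviours are assumptions, as is the name
       parse_dictionary.

```python
import re


def parse_dictionary(text):
    """Return {index_word: reading} from dictionary file contents."""
    readings = {}
    for line in text.splitlines():
        # The delimiter is one or more tabs or spaces.
        fields = re.split(r"[ \t]+", line.strip(" \t"))
        if len(fields) != 2:
            continue  # blank or malformed line (assumption: ignore it)
        word, reading = fields
        # Only one Yomi per word: keep the first one seen (assumption).
        readings.setdefault(word, reading)
    return readings
```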

SORTING METHOD
       By default, upmendex sorts index entries as they are ('sort by word
       order'). With the -l option, spaces between words in an entry are
       removed before sorting ('sort by character order'). Even with sort
       by character order, the entries in the output keep their original
       form, spaces included. An example follows.

              sort by word order      sort by character order
              X Window                Xlib
              Xlib                    XView
              XView                   X Window

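       The difference between the two orders can be reproduced with a
       small Python sketch. It compares case-folded strings by code point,
       which is only a rough stand-in for the ICU collation that upmendex
       uses; the function name sort_entries is hypothetical.

```python
def sort_entries(entries, by_character=False):
    """Sort entries by word order, or by character order if requested."""
    def key(entry):
        # Character order: spaces are removed from the sort key only.
        text = entry.replace(" ", "") if by_character else entry
        return text.casefold()
    # The entries themselves are returned unchanged, spaces included.
    return sorted(entries, key=key)
```

       Only the sort key loses its spaces, which mirrors the statement that
       the output keeps the original form of each entry.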

       In addition, two sorting methods are available for entries that
       contain both Japanese Kana and other scripts (e.g. Latin script).
       With priority 0 (the default) in the style file, no space is
       inserted between Japanese Kana and other scripts before sorting;
       with priority 1, a space is inserted. An example follows.

              priority=0              priority=1
              index sort              indファイル
              indファイル             index sort

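       The effect of priority on this example can be imitated in Python by
       inserting a space at every boundary between kana and non-kana text
       before comparing. This is a sketch under assumptions: kana are taken
       to be the code points U+3040 to U+30FF, comparison is by code point
       rather than ICU collation, and the name sort_key is hypothetical.

```python
import re

KANA = r"\u3040-\u30ff"
# Zero-width boundaries: kana followed by non-kana, or the reverse.
BOUNDARY = re.compile(
    rf"(?<=[{KANA}])(?=[^{KANA}\s])|(?<=[^{KANA}\s])(?=[{KANA}])")


def sort_key(entry, priority=0):
    """Return the string used for comparison under the given priority."""
    if priority:
        return BOUNDARY.sub(" ", entry)
    return entry
```

       With priority 1 the key of "indファイル" becomes "ind ファイル",
       which sorts before "index sort" because a space precedes 'e'.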

ENVIRONMENT VARIABLES
       upmendex refers to the following environment variables.

       INDEXSTYLE
              Directory where index style files are located.

       INDEXDEFAULTSTYLE
              Index style file used by default.

       INDEXDICTIONARY
              Directory where dictionary files are located.

       INDEXDEFAULTDICTIONARY
              Dictionary file that is read automatically.

DETAILS
       The detailed specification is compatible with makeindex.

KNOWN ISSUES
       When more than one page number notation is used, the .idx files
       should be specified in the order of their page numbers. Otherwise,
       wrong page numbers might be output.

SEE ALSO
       tex(1), latex(1), makeindex(1), mendex(1).
       International Components for Unicode (ICU):
       <http://site.icu-project.org/>

AUTHOR
       This manual page was written by Takuji Tanaka, based on the mendex
       manual page written by the Japanese TeX Development Community.

                                                                  UPMENDEX(1)