UPMENDEX(1)                 General Commands Manual                UPMENDEX(1)

NAME

       upmendex - Multilingual index processor

SYNOPSIS

       upmendex [-ilqrcgf] [-s sty] [-d dic] [-o ind] [-t log] [-p no] [--]
                [idx0 idx1 idx2 ...]
       upmendex --help

DESCRIPTION

       The program upmendex is a general purpose multilingual hierarchical
       index generator that works with upLaTeX, XeLaTeX and LuaLaTeX. It
       accepts one or more input files (.idx; often produced by a text
       formatter such as the LaTeX family), sorts the entries, and produces
       an output file which can be formatted. It supports Latin (including
       non-English), Greek, Cyrillic, Korean Hangul and Han (Hanzi
       ideographs) scripts, as well as Japanese Kana. It is almost compatible
       with makeindex and mendex, and an additional feature for handling
       readings of kanji words is also available.
       The formats of the input and output files are specified in a style
       file. The readings of kanji words can be specified in a dictionary
       file.
       The index can have up to three levels (0, 1, and 2) of subitem nesting.

OPTIONS

       -i        Take input from stdin, even when index files are specified.

       -l        Use ´sort by character order´. By default, ´sort by word
                 order´ is used. See ABOUT SORTING PROCEDURE below.

       -q        Quiet mode; send no messages to stderr, except error messages
                 and warnings.

       -r        Disable implicit page range formation. By default, three or
                 more successive pages are automatically abbreviated as a
                 range (e.g. 1–5).

       -c        Compress each sequence of intermediate blanks (spaces and/or
                 tabs) into a single space and ignore leading and trailing
                 blanks. By default, blanks in the index key are retained.

       -g        Use only the A-line (A, Ka, Sa, ...; 10 characters) of the
                 gojuon table (Japanese syllabary) as Japanese index
                 headings. By default, all 48 characters in the gojuon table
                 are used.

       -f        Force output of characters even if their scripts are not
                 supported by upmendex.

       -s sty    Employ sty as the style file.

       -d dic    Employ dic as the dictionary file. The dictionary file
                 consists of a list of <index_word reading> pairs.

       -o ind    Employ ind as the output index file. By default, the file
                 name is created by appending the extension ind to the base
                 name of the first input file.

       -t log    Employ log as the transcript file. By default, the file name
                 is created by appending the extension ilg to the base name of
                 the first input file.

       -p no     Set the starting page number of the output index list to
                 no. The argument no may be a number or one of the following:
                 any (the page after the end of the contents), odd (the next
                 odd page after the end of the contents), even (the next even
                 page after the end of the contents).

       --help    Show a summary of options.

       --        Arguments after -- are not taken as options. This is useful
                 when an input file name starts with '-'.
75
76
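       The implicit page range formation that -r disables can be pictured
       with a small Python sketch. It is an illustration only, not the
       upmendex implementation: the function name format_pages and its
       parameters are hypothetical, and the defaults merely mirror the
       delim_n and delim_r style parameters described under STYLE FILE.

```python
def format_pages(pages, form_ranges=True, delim_n=", ", delim_r="--"):
    """Join page numbers, abbreviating runs of 3 or more pages as a range."""
    pages = sorted(set(pages))
    parts = []
    i = 0
    while i < len(pages):
        j = i
        # Extend j to the end of the run of successive pages starting at i.
        while j + 1 < len(pages) and pages[j + 1] == pages[j] + 1:
            j += 1
        if form_ranges and j - i >= 2:
            # Three or more successive pages: first page, delim_r, last page.
            parts.append(f"{pages[i]}{delim_r}{pages[j]}")
        else:
            # Shorter runs (or -r behaviour): list every page individually.
            parts.extend(str(p) for p in pages[i:j + 1])
        i = j + 1
    return delim_n.join(parts)


print(format_pages([1, 2, 3, 4, 5, 8, 9, 12]))
print(format_pages([1, 2, 3, 4, 5, 8, 9, 12], form_ranges=False))
```

       Calling it with form_ranges=False corresponds to running with -r:
       every page is listed separately.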

STYLE FILE

       The style file informs upmendex about the format of the idx input files
       and the intended format of the final output file. The format is upward
       compatible with the one for makeindex and mendex. The style file
       contains a list of <specifier attribute> pairs. There are two types of
       specifiers: input and output. Pairs do not have to appear in any
       particular order. A line beginning with ´%´ is a comment.
84
85
86       Input file style parameter
87
88       keyword  <string>             "\\indexentry"
89                                     Command with an argument of  index  entry
90                                     which is going to be processed.
91
92       arg_open  <char>              ´{´
93                                     Opening  delimiter which shows the begin‐
94                                     ning of index entry.
95
96       arg_close  <char>             ´}´
97                                     Closing delimiter which shows the end  of
98                                     index entry.
99
100       range_open  <char>            ´(´
101                                     Opening  delimiter which shows the begin‐
102                                     ning of page range.
103
104       range_close  <char>           ´)´
105                                     Closing delimiter which shows the end  of
106                                     page range.
107
108       level  <char>                 ´!´
109                                     Delimiter which shows lower level.
110
111       actual  <char>                ´@´
112                                     Symbol  which  shows the next sequence is
113                                     to appear as index strings in the  output
114                                     file.
115
116       encap  <char>                 ´|´
117                                     Symbol  which  shows the next sequence is
118                                     to be used as command  name  attached  to
119                                     the page number.
120
121       page_compositor  <string>     "-"
122                                     Separator between page levels for a style
123                                     with multi-levels of page numbers.
124
125       page_precedence  <string>     "rnaRA"
126                                     Priority of expression for  page  number.
127                                     ´R´ and ´r´ correspond to Roman. ´n´ cor‐
128                                     responds to arabic numeral.  ´A´ and  ´a´
129                                     correspond to Latin alphabet.
130
131       quote  <char>                 ´"´
132                                     Escape character for upmendex parameters.
133
134       escape  <char>                ´\\´
135                                     Escape character for general scripts.
136
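       How the default input parameters fit together can be sketched in
       Python. This is a simplified illustration, not the upmendex parser:
       parse_entry is a hypothetical helper, the keyword and the argument
       delimiters are hard-coded to their defaults, and the quote, escape
       and range_open/range_close parameters are ignored.

```python
import re

# Default keyword "\indexentry" with arg_open '{' and arg_close '}'.
ENTRY = re.compile(r"^\\indexentry\{(.*)\}\{([^{}]*)\}$")


def parse_entry(line, level="!", actual="@", encap="|"):
    """Split one .idx line into levels, encapsulating command and page."""
    m = ENTRY.match(line.strip())
    if m is None:
        raise ValueError(f"not an index entry: {line!r}")
    body, page = m.groups()
    # Everything after the encap symbol is the command attached to the page.
    body, _, command = body.partition(encap)
    levels = []
    for part in body.split(level):
        # "key@shown": sort by key, print shown. Without '@' both are equal.
        key, sep, shown = part.partition(actual)
        levels.append((key, shown if sep else key))
    return {"levels": levels, "encap": command or None, "page": page}


print(parse_entry(r"\indexentry{ひょう@表|textbf}{12}"))
print(parse_entry(r"\indexentry{TeX!upmendex}{iv}"))
```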
137       Output file style parameter
138
139       preamble  <string>            "\\begin{theindex}\n"
140                                     Preamble of output file.
141
142       postamble  <string>           "\n\n\\end{theindex}\n"
143                                     Postamble of output file.
144
145       setpage_prefix  <string>      "\n  \\setcounter{page}{"
146                                     Prefix  of  page  number if start page is
147                                     designated.
148
149       setpage_suffix  <string>      "}\n"
150                                     Suffix of page number if  start  page  is
151                                     designated.
152
153       group_skip  <string>          "\n\n  \\indexspace\n"
154                                     Strings  to  insert vertical space before
155                                     new section of index.
156
157       lethead_prefix  <string>      ""
158                                     Prefix  of  heading  for  newly  appeared
159                                     heading letter.
160
161       heading_prefix  <string>      ""
162                                     Same  as lethead_prefix. (compatible with
163                                     makeindex)
164
165       lethead_suffix  <string>      ""
166                                     Suffix  of  heading  for  newly  appeared
167                                     heading letter.
168
169       heading_suffix  <string>      ""
170                                     Same  as lethead_suffix. (compatible with
171                                     makeindex)
172
       lethead_flag  <number>        0
                                     Flag to control output of heading letters
                                     in Latin, Greek and Cyrillic scripts.
                                     ´0´, ´1´, ´-1´ and ´2´ denote no output,
                                     uppercase, lowercase and titlecase,
                                     respectively.
179
180       heading_flag  <number>        0
181                                     Same as  lethead_flag.  (Note:  makeindex
182                                     uses a different name headings_flag)
183
184       headings_flag  <number>       0
185                                     Same  as  lethead_flag.  (compatible with
186                                     makeindex)
187
188       kana_head  <string>           ""
189                                     Heading characters of Kana specified by a
190                                     string.   By default, it is controlled by
191                                     letter_head and command line  option  -g.
192                                     (Extended by upmendex)
193
194       hangul_head  <string>         "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
195                                     Heading characters of Hangul specified by
196                                     a string.  (Extended by upmendex)
197
198       tumunja  <string>             "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
199                                     Heading characters of Hangul specified by
200                                     a  string.   (Deprecated, Extended by up‐
201                                     mendex)
202
       hanzi_head  <string>          ""
                                     Heading strings of hanzi (Kanji, Hanja)
                                     specified by a string formed by
                                     concatenating items with the separator
                                     ´;´. (Extended by upmendex)
208
209       devanagari_head  <string>     "ऄअआइईउऊऋऌऍऎएऐऑऒओऔकखगघङचछजझञटठडढणतथदधनपफबभमयरलळवशषसह"
210                                     Heading characters of  Devanagari  speci‐
211                                     fied  by  a  string.   (Experimental, Ex‐
212                                     tended by upmendex)
213
214       thai_head  <string>           "กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรฤลฦวศษสหฬอฮ"
215                                     Heading  characters of Thai script speci‐
216                                     fied by  a  string.   (Experimental,  Ex‐
217                                     tended by upmendex)
218
219       item_0  <string>              "\n  \\item "
220                                     Command sequence inserted between primary
221                                     level entries.
222
223       item_1  <string>              "\n     \\subitem "
224                                     Command  sequence  inserted  between  sub
225                                     level entries.
226
227       item_2  <string>              "\n       \\subsubitem "
228                                     Command  sequence inserted between subsub
229                                     level entries.
230
       item_01  <string>             "\n    \\subitem "
                                     Command sequence inserted between primary
                                     and sub level entries.
234
235       item_x1  <string>             "\n    \\subitem "
236                                     Command sequence inserted between primary
237                                     and sub level  entries  when  main  entry
238                                     does not have page number.
239
240       item_12  <string>             "\n    \\subsubitem "
241                                     Command sequence inserted between sub and
242                                     subsub level entries.
243
244       item_x2  <string>             "\n    \\subsubitem "
245                                     Command sequence inserted between sub and
246                                     subsub level entries when sub level entry
247                                     does not have page number.
248
249       delim_0  <string>             ", "
250                                     Delimiter string  between  primary  level
251                                     entry and first page number.
252
253       delim_1  <string>             ", "
254                                     Delimiter  string between sub level entry
255                                     and first page number.
256
257       delim_2  <string>             ", "
258                                     Delimiter string between subsub level en‐
259                                     try and first page number.
260
261       delim_n  <string>             ", "
262                                     Delimiter  string  between  page  numbers
263                                     commonly used for any entry level.
264
265       delim_r  <string>             "--"
266                                     Delimiter string between  pages  to  show
267                                     page range.
268
269       delim_t  <string>             ""
270                                     Delimiter  string  output  at  the end of
271                                     page number list.
272
       suffix_2p  <string>           ""
                                     String to be inserted in place of delim_n
                                     and the next page number when two pages
                                     are contiguous. It works only when the
                                     parameter is defined.

       suffix_3p  <string>           ""
                                     String to be inserted in place of delim_r
                                     and the third page number when three
                                     pages are contiguous. This parameter
                                     takes precedence over suffix_mp. It works
                                     only when the parameter is defined.

       suffix_mp  <string>           ""
                                     String to be inserted in place of delim_r
                                     and the last page number when three or
                                     more pages are contiguous. It works only
                                     when the parameter is defined.
291
292       encap_prefix  <string>        "\\"
293                                     Prefix  for an encapsulating command when
294                                     the encapsulating command is added to the
295                                     page number.
296
297       encap_infix  <string>         "{"
298                                     Prefix  just  before the page number when
299                                     the encapsulating command is added to the
300                                     page number.
301
       encap_suffix  <string>        "}"
                                     Suffix after the page number when the
                                     encapsulating command is added to the
                                     page number.

       line_max  <number>            72
                                     Maximum length of one line. Lines
                                     exceeding this number are folded.

       indent_space  <string>        ""
                                     String used as the indent inserted at the
                                     beginning of a folded line.

       indent_length  <number>       16
                                     Length of the indent inserted at the
                                     beginning of a folded line.
318
       symhead_positive  <string>    "Symbols"
                                     String to output as the heading for
                                     symbols when lethead_flag, heading_flag
                                     or headings_flag is a positive number.

       symhead_negative  <string>    "symbols"
                                     String to output as the heading for
                                     symbols when lethead_flag, heading_flag
                                     or headings_flag is a negative number.

       symbol  <string>              ""
                                     String to output as the heading for
                                     symbols when symbol_flag is non-zero. If
                                     specified, this parameter takes
                                     precedence over symhead_positive and
                                     symhead_negative. (Extended by
                                     (up)mendex)

       numhead_positive  <string>    "Numbers"
                                     String to output as the heading for
                                     numbers when lethead_flag, heading_flag
                                     or headings_flag is a positive number and
                                     symbol_flag is 2.

       numhead_negative  <string>    "numbers"
                                     String to output as the heading for
                                     numbers when lethead_flag, heading_flag
                                     or headings_flag is a negative number and
                                     symbol_flag is 2.

       symbol_flag  <number>         1
                                     Flag to control output of headings for
                                     symbols and numbers. If ´0´, do not
                                     output headings for symbols and numbers.
                                     If ´1´, output symbols and numbers as a
                                     single group of symbols. If ´2´, output
                                     symbols and numbers separately. (Extended
                                     by (up)mendex)

       letter_head  <number>         1
                                     Flag of heading letter for Japanese Kana.
                                     If ´1´, Katakana is used; if ´2´,
                                     Hiragana is used. (Extended by
                                     (up)mendex)
360
       priority  <number>            0
                                     Flag of sorting method for index words
                                     composed of Japanese and non-Japanese
                                     (e.g. Latin) scripts. If non-zero, one
                                     space (U+0020) is inserted between each
                                     Japanese sequence and non-Japanese
                                     sequence in the sorting procedure.
                                     (Extended by (up)mendex)

       character_order  <string>     "SNLGCJKHDTah"
                                     Order of scripts and symbols. ´S´, ´N´,
                                     ´L´, ´G´, ´C´, ´J´, ´K´, ´H´, ´D´, ´T´,
                                     ´a´ and ´h´ denote symbol, number, Latin,
                                     Greek, Cyrillic, Japanese Kana, Korean
                                     Hangul, Hanzi, Devanagari, Thai, Arabic
                                     and Hebrew script, respectively. ´@´
                                     denotes scripts which are not explicitly
                                     designated; their order is configured by
                                     icu_rules or icu_locale. Make sure that
                                     ´S´ and ´N´ are next to each other if
                                     symbol_flag=1, since numbers are then
                                     classified as a part of symbols.
                                     (Extended by upmendex)
384
385       script_preamble  <string 1>  <string 2>
386                                     ""
387                                     Preamble of script block in output  file,
388                                     specified  by  string  2.   One of script
389                                     names must be specified in the string  1:
390                                     ´latin´,   ´cyrillic´,  ´greek´,  ´kana´,
391                                     ´hangul´, ´hanzi´, ´devanagari´,  ´thai´,
392                                     ´arabic´,  or ´hebrew´.  (Extended by up‐
393                                     mendex)
394
395       script_postamble  <string 1>  <string 2>
396                                     ""
397                                     Postamble of script block in output file,
398                                     specified  by  string  2.   One of script
399                                     names must be specified in the string  1:
400                                     ´latin´,   ´cyrillic´,  ´greek´,  ´kana´,
401                                     ´hangul´, ´hanzi´, ´devanagari´,  ´thai´,
402                                     ´arabic´,  or ´hebrew´.  (Extended by up‐
403                                     mendex)
404
       icu_locale  <string>          ""
                                     Locale in ICU collator. By default,
                                     "root sort order" is set. (Extended by
                                     upmendex)

       icu_rules  <string>           ""
                                     Customized collation rules in ICU
                                     collator. Unicode characters in UTF-8
                                     encoding and the following escape
                                     sequences are accepted: \Uhhhhhhhh
                                     (8-digit hexadecimal [0-9A-Fa-f]), \uhhhh
                                     (4-digit hexadecimal), \xhh (2-digit
                                     hexadecimal), \x{h...} (1..8-digit
                                     hexadecimal), and \ooo (3-digit octal
                                     [0-7]). If icu_rules and icu_locale are
                                     specified simultaneously, the collation
                                     rules specified by icu_rules are added to
                                     the collation rules specified by
                                     icu_locale. By default, locale is used.
                                     (Extended by upmendex)
       Ref. <https://unicode-org.github.io/icu/userguide/collation/customization/>,
       <http://www.unicode.org/reports/tr35/tr35-collation.html#Rules>

       icu_attributes  <string>      ""
                                     Attributes in ICU collator. The following
                                     are available: "alternate:shifted",
                                     "alternate:non-ignorable",
                                     "strength:primary", "strength:secondary",
                                     "strength:tertiary",
                                     "strength:quaternary",
                                     "strength:identical",
                                     "french-collation:on",
                                     "french-collation:off", "case-first:off",
                                     "case-first:upper-first",
                                     "case-first:lower-first",
                                     "case-level:on", "case-level:off",
                                     "normalization-mode:on",
                                     "normalization-mode:off",
                                     "numeric-ordering:on",
                                     "numeric-ordering:off" (Extended by
                                     upmendex)
       Ref. <https://unicode-org.github.io/icu/userguide/collation/customization/#default-options>,
       <http://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options>
444
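       The escape sequences accepted in icu_rules can be illustrated with a
       short Python sketch that expands them into characters. It only
       demonstrates the syntax listed above; decode_escapes is a
       hypothetical helper and does not reproduce how upmendex itself
       reads a style file.

```python
import re

# One alternative per escape form listed for icu_rules.
ESCAPE = re.compile(
    r"\\U([0-9A-Fa-f]{8})"          # \Uhhhhhhhh  8-digit hexadecimal
    r"|\\u([0-9A-Fa-f]{4})"         # \uhhhh      4-digit hexadecimal
    r"|\\x\{([0-9A-Fa-f]{1,8})\}"   # \x{h...}    1..8-digit hexadecimal
    r"|\\x([0-9A-Fa-f]{2})"         # \xhh        2-digit hexadecimal
    r"|\\([0-7]{3})"                # \ooo        3-digit octal
)


def decode_escapes(text):
    """Replace each recognised escape sequence with its character."""
    def replace(match):
        *hex_groups, octal = match.groups()
        if octal is not None:
            return chr(int(octal, 8))
        digits = next(g for g in hex_groups if g is not None)
        # chr() raises ValueError for values beyond U+10FFFF.
        return chr(int(digits, 16))
    return ESCAPE.sub(replace, text)


print(decode_escapes(r"&\u3042<\x{3044}"))
```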

ABOUT JAPANESE PROCESSING

       upmendex has an additional feature to simplify the procedure of
       handling Japanese indexes, compared to makeindex. Users can save the
       effort of manually specifying a reading for every kanji word.
       Japanese kanji words are usually sorted by the syllables of their
       readings (´Yomi´), which can be represented by kana (Hiragana,
       Katakana) scripts. upmendex accepts index words specified in kana
       expression directly in an input file, and can also convert index words
       in kanji or symbols to phonogram scripts by referring to Japanese
       dictionaries.

       Examples of internal simplification of syllables are shown below.

              かぶしきがいしゃ         かふしきかいしや
              マッキントッシュ         まつきんとつしゆ
              ワープロ                 わあふろ
462
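       The three examples above are consistent with the following rules:
       Katakana is folded to Hiragana, voiced and semi-voiced sound marks
       are dropped, small kana become full-size kana, and the long vowel
       mark is replaced by the vowel of the preceding kana. The Python
       sketch below implements exactly these inferred rules. It is an
       assumption drawn from the examples, not a description of the
       upmendex source, and simplify_reading is a hypothetical name.

```python
import unicodedata

SMALL_TO_LARGE = str.maketrans("ぁぃぅぇぉっゃゅょゎ", "あいうえおつやゆよわ")
VOWEL_ROWS = {
    "あ": "あかさたなはまやらわ",
    "い": "いきしちにひみり",
    "う": "うくすつぬふむゆる",
    "え": "えけせてねへめれ",
    "お": "おこそとのほもよろを",
}
# Map each (already simplified) kana to the vowel it ends in.
VOWEL_OF = {kana: vowel for vowel, row in VOWEL_ROWS.items() for kana in row}


def simplify_reading(reading):
    out = []
    for ch in reading:
        if "ァ" <= ch <= "ヶ":
            # Katakana U+30A1..U+30F6 sits 0x60 above the matching Hiragana.
            ch = chr(ord(ch) - 0x60)
        # NFD splits off (semi-)voiced sound marks; keep the base kana only.
        ch = unicodedata.normalize("NFD", ch)[0]
        ch = ch.translate(SMALL_TO_LARGE)
        if ch == "ー" and out:
            # Long vowel mark: repeat the vowel of the preceding kana.
            ch = VOWEL_OF.get(out[-1], ch)
        out.append(ch)
    return "".join(out)


for word in ("かぶしきがいしゃ", "マッキントッシュ", "ワープロ"):
    print(word, simplify_reading(word))
```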
       The dictionary file consists of a list of <´index_word´ ´reading´>
       pairs. The index word can be written in any script (kanji, kana, etc.),
       and the reading can be in any phonogram script such as Hiragana or
       Katakana. The delimiter between the index word and its reading is one
       or more tabs or spaces.
       An example of a Japanese dictionary is shown below.

              漢字      かんじ
              読み      よみ
              環境      かんきょう
              $         ドル

       Here, each index word is allowed to have only one Yomi. Though some
       kanji words (e.g. 「表」) may have more than one Yomi (e.g. 「ひょう」
       and 「おもて」), only one of them can be registered in the dictionary.
       When different Yomi are needed, they should be specified explicitly in
       kana expression (e.g. \index{ひょう@表} or \index{おもて@表}) in the
       input file.
       Moreover, a dictionary file is read automatically when its file name
       is set in the environment variable INDEXDEFAULTDICTIONARY. The
       dictionary set by the environment variable can be used together with
       files specified by the -d option.
485
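       The dictionary format described in this section (one index word and
       one reading per line, separated by tabs or spaces) can be read with
       a few lines of Python. This is a sketch under assumptions:
       load_dictionary is a hypothetical helper, and both the skipping of
       malformed lines and the choice to keep the first reading of a
       repeated word are illustrative decisions, not documented upmendex
       behaviour.

```python
def load_dictionary(lines):
    """Build an {index_word: reading} mapping from dictionary lines."""
    readings = {}
    for line in lines:
        # Split once on the first run of whitespace (tabs and/or spaces).
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # assumption: ignore blank or incomplete lines
        word, reading = fields[0], fields[1].strip()
        # Each word may have only one Yomi; assumption: the first one wins.
        readings.setdefault(word, reading)
    return readings


sample = ["漢字\tかんじ", "読み      よみ", "", "$        ドル"]
print(load_dictionary(sample))
```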

ABOUT SORTING PROCEDURE

       upmendex sorts indexes as is (´sort by word order´) by default. When
       the -l option is set, spaces between words in an index entry are
       removed prior to the sorting procedure (´sort by character order´).
       Even with sort by character order, the index entries in the output
       keep their original form, including the spaces.
       An example follows.

              sort by word order       sort by character order
              X Window                 Xlib
              Xlib                     XView
              XView                    X Window
498
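       The difference between the two orders comes down to the sort key.
       The Python sketch below reproduces the table above with a
       case-insensitive code point comparison. That comparison is only a
       stand-in chosen for illustration (upmendex collates with ICU), and
       sort_index is a hypothetical name.

```python
def sort_index(entries, character_order=False):
    """Sort entries by word order, or by character order as with -l."""
    def key(entry):
        text = entry.casefold()
        if character_order:
            # Spaces are removed from the key only; the entry is unchanged.
            text = text.replace(" ", "")
        return text
    return sorted(entries, key=key)


words = ["XView", "Xlib", "X Window"]
print(sort_index(words))
print(sort_index(words, character_order=True))
```

       Because only the key is altered, the sorted output still contains
       "X Window" with its space.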
       In addition, two sorting methods can be applied to index entries which
       contain both Japanese kana and other scripts (e.g. Latin script). When
       priority is set to 0 (default) in a style file, no space is inserted
       between Japanese Kana and other scripts prior to the sorting
       procedure; when it is set to 1, a space is inserted.
       An example follows.

              priority=0               priority=1
              index sort               indファイル
              indファイル              index sort
509
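       The effect of priority can be reproduced with a sort key that
       inserts a space at every boundary between Japanese and non-Japanese
       text when priority is non-zero, as stated for the priority
       parameter in STYLE FILE. The Python sketch below is illustrative
       only: priority_key is a hypothetical name, the character ranges
       treated as Japanese are an assumption, and plain code point
       comparison stands in for ICU collation.

```python
import re

# Assumed "Japanese" ranges for this sketch: kana and common kanji.
JP = r"\u3040-\u30ff\u4e00-\u9fff"
# Zero-width positions where Japanese text meets non-Japanese text.
BOUNDARY = re.compile(rf"(?<=[{JP}])(?=[^{JP}\s])|(?<=[^{JP}\s])(?=[{JP}])")


def priority_key(entry, priority=0):
    """Sort key; a non-zero priority inserts U+0020 at script boundaries."""
    text = entry.casefold()
    if priority != 0:
        text = BOUNDARY.sub(" ", text)
    return text


words = ["indファイル", "index sort"]
print(sorted(words, key=lambda w: priority_key(w, 0)))
print(sorted(words, key=lambda w: priority_key(w, 1)))
```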

ENVIRONMENT VARIABLES

       upmendex refers to the following environment variables.

       INDEXSTYLE
                 Directory where index style files exist.

       INDEXDEFAULTSTYLE
                 Index style file to be referred to as default.

       INDEXDICTIONARY
                 Directory where dictionary files exist.

       INDEXDEFAULTDICTIONARY
                 Dictionary file which is automatically read.

DETAIL

       The detailed specification is compatible with makeindex.

KNOWN ISSUES

       When more than one page number format is used, .idx files should be
       specified in the order of their page numbers. Otherwise, wrong page
       numbers might be output.

SEE ALSO

       tex(1), latex(1), makeindex(1), mendex(1).
       International Components for Unicode (ICU): <http://icu.unicode.org/>,
       <https://unicode-org.github.io/icu/>

AUTHOR

       This manual page was written by Takuji Tanaka based on the mendex
       manual page written by Japanese TeX Development Community.