UPMENDEX(1)                 General Commands Manual                UPMENDEX(1)

NAME

       upmendex - Multilingual index processor

SYNOPSIS

       upmendex [-ilqrcgf] [-s sty] [-d dic] [-o ind] [-t log] [-p no] [--]
                [idx0 idx1 idx2 ...]
       upmendex --help

DESCRIPTION

       upmendex is a general-purpose multilingual hierarchical index
       generator that works with upLaTeX, XeLaTeX and LuaLaTeX. It accepts
       one or more input files (.idx, usually produced by a text formatter
       of the LaTeX family), sorts the entries, and produces an output file
       that can be formatted. It supports Latin (including non-English),
       Greek, Cyrillic, Korean Hangul and Han (Hanzi ideograph) scripts, as
       well as Japanese Kana. It is almost compatible with makeindex and
       mendex, and adds a feature for handling the readings of kanji words.
       The formats of the input and output files are specified in a style
       file. The readings of kanji words can be specified in a dictionary
       file.
       The index can have up to three levels (0, 1, and 2) of subitem nesting.

OPTIONS

       -i        Take input from stdin, even when index files are specified.

       -l        Use ´sort by character order´. By default, ´sort by word
                 order´ is used. See ABOUT SORTING PROCEDURE below.

       -q        Quiet mode; send no messages to stderr except error messages
                 and warnings.

       -r        Disable implicit page range formation. By default, three or
                 more successive pages are automatically abbreviated as a
                 range (e.g. 1–5).

       -c        Compress each sequence of intermediate blanks (spaces and/or
                 tabs) into a single space and ignore leading and trailing
                 blanks. By default, blanks in the index key are retained.

       -g        Use only the A-line (A, Ka, Sa, ...; 10 characters) of the
                 gojuon table (Japanese syllabary) for Japanese index
                 headings. By default, all 48 characters in the gojuon table
                 are used.

       -f        Force output of characters even if their scripts are not
                 supported by upmendex.

       -s sty    Use sty as the style file.

       -d dic    Use dic as the dictionary file. The dictionary file consists
                 of a list of <index_word reading> pairs.

       -o ind    Use ind as the output index file. By default, the file name
                 is created by appending the extension ind to the base name
                 of the first input file.

       -t log    Use log as the transcript file. By default, the file name is
                 created by appending the extension ilg to the base name of
                 the first input file.

       -p no     Set the starting page number of the output index to no. The
                 argument no may be a number or one of the following: any
                 (the page following the end of the contents), odd (the next
                 odd page after the end of the contents), even (the next even
                 page after the end of the contents).

       --help    Show a summary of options.

       --        Arguments after -- are not treated as options. This is useful
                 when an input file name starts with '-'.
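       As a minimal sketch of how the options above combine, the following
       Python helpers assemble an upmendex argument list and derive the
       default output and transcript names described under -o and -t. The
       names default_names and build_command are illustrative and not part
       of upmendex; the sketch assumes that "base name" means the file name
       without its last extension, and it uses only flags documented above.

```python
from pathlib import Path


def default_names(first_idx):
    """Default .ind and .ilg names derived from the first input file.

    Assumption: the base name is the file name without its last extension.
    """
    base = Path(first_idx)
    return str(base.with_suffix(".ind")), str(base.with_suffix(".ilg"))


def build_command(idx_files, style=None, dictionary=None, char_order=False,
                  quiet=False, start_page=None):
    """Assemble an argument list using only options documented above."""
    cmd = ["upmendex"]
    if char_order:
        cmd.append("-l")          # sort by character order
    if quiet:
        cmd.append("-q")          # quiet mode
    if style is not None:
        cmd += ["-s", style]      # style file
    if dictionary is not None:
        cmd += ["-d", dictionary]  # dictionary file
    if start_page is not None:
        cmd += ["-p", str(start_page)]  # a number, or any / odd / even
    # "--" keeps input names that start with '-' from being read as options.
    cmd.append("--")
    cmd += list(idx_files)
    return cmd
```

       If upmendex is installed, the resulting list can be passed to
       subprocess.run(cmd, check=True).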

STYLE FILE

       The style file tells upmendex the format of the .idx input files and
       the intended format of the final output file. The format is upward
       compatible with the one used by makeindex and mendex. The style file
       contains a list of <specifier attribute> pairs. There are two types of
       specifiers: input and output. Pairs do not have to appear in any
       particular order. A line beginning with ´%´ is a comment.

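       To illustrate the <specifier attribute> layout, the following Python
       sketch holds a small sample style file in a string and reads it into
       a dictionary. It is a simplified reader written for this example, not
       the parser used by upmendex: it assumes one pair per line and keeps
       each attribute as raw text without interpreting quotes or escapes.
       The names SAMPLE_STYLE and parse_style are hypothetical.

```python
SAMPLE_STYLE = r'''
% Sample style: bold uppercase heading letters
lethead_flag   1
lethead_prefix "\n  \\textbf{"
lethead_suffix "}\n"
delim_r        "--"
'''


def parse_style(text):
    """Map each specifier to its raw attribute text (one pair per line)."""
    pairs = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # blank line or comment
        fields = stripped.split(None, 1)  # specifier, then the rest
        specifier = fields[0]
        attribute = fields[1] if len(fields) > 1 else ""
        pairs[specifier] = attribute
    return pairs


style = parse_style(SAMPLE_STYLE)
```
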
       Input file style parameters

       keyword  <string>             "\\indexentry"
                                     Command whose argument is the index entry
                                     to be processed.

       arg_open  <char>              ´{´
                                     Opening delimiter that marks the
                                     beginning of an index entry.

       arg_close  <char>             ´}´
                                     Closing delimiter that marks the end of
                                     an index entry.

       range_open  <char>            ´(´
                                     Opening delimiter that marks the
                                     beginning of a page range.

       range_close  <char>           ´)´
                                     Closing delimiter that marks the end of
                                     a page range.

       level  <char>                 ´!´
                                     Delimiter that introduces a lower level.

       actual  <char>                ´@´
                                     Symbol indicating that the following
                                     sequence is to appear as the index string
                                     in the output file.

       encap  <char>                 ´|´
                                     Symbol indicating that the following
                                     sequence is to be used as the name of a
                                     command applied to the page number.

       page_compositor  <string>     "-"
                                     Separator between page levels for a style
                                     with multi-level page numbers.

       page_precedence  <string>     "rnaRA"
                                     Precedence of page number formats. ´R´
                                     and ´r´ correspond to uppercase and
                                     lowercase Roman numerals, ´n´ to Arabic
                                     numerals, and ´A´ and ´a´ to uppercase
                                     and lowercase Latin letters.

       quote  <char>                 ´"´
                                     Escape character for upmendex parameters.

       escape  <char>                ´\\´
                                     Escape character for general scripts.

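       The default input parameters above can be illustrated with a short
       Python sketch that splits one line of an .idx file into its parts.
       This is a simplified illustration, not the upmendex parser: it
       ignores the quote and escape characters, the range delimiters, and
       nested braces, and the name parse_entry is hypothetical.

```python
import re

# Matches: \indexentry{<body>}{<page>} using the default keyword,
# arg_open and arg_close values.
ENTRY = re.compile(r'^\\indexentry\{(?P<body>.*)\}\{(?P<page>[^{}]*)\}\s*$')


def parse_entry(line, level="!", actual="@", encap="|"):
    """Split an index entry into levels, encapsulating command and page."""
    match = ENTRY.match(line)
    if match is None:
        return None
    body, page = match.group("body"), match.group("page")
    # Everything after the encap symbol names the page-number command.
    body, _, encap_cmd = body.partition(encap)
    levels = []
    for part in body.split(level):
        # Before "actual": the sort key. After it: the displayed string.
        sort_key, sep, display = part.partition(actual)
        levels.append((sort_key, display if sep else sort_key))
    return {"levels": levels, "encap": encap_cmd or None, "page": page}


entry = parse_entry(r"\indexentry{ひょう@表!見出し|textbf}{12}")
```
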
       Output file style parameters

       preamble  <string>            "\\begin{theindex}\n"
                                     Preamble of the output file.

       postamble  <string>           "\n\n\\end{theindex}\n"
                                     Postamble of the output file.

       setpage_prefix  <string>      "\n  \\setcounter{page}{"
                                     Prefix of the page number when a start
                                     page is specified.

       setpage_suffix  <string>      "}\n"
                                     Suffix of the page number when a start
                                     page is specified.

       group_skip  <string>          "\n\n  \\indexspace\n"
                                     String inserted to add vertical space
                                     before a new section of the index.

       lethead_prefix  <string>      ""
                                     Prefix of the heading for a new heading
                                     letter.

       heading_prefix  <string>      ""
                                     Same as lethead_prefix (compatible with
                                     makeindex).

       lethead_suffix  <string>      ""
                                     Suffix of the heading for a new heading
                                     letter.

       heading_suffix  <string>      ""
                                     Same as lethead_suffix (compatible with
                                     makeindex).

       lethead_flag  <number>        0
                                     Flag controlling the output of heading
                                     letters in Latin, Greek and Cyrillic
                                     scripts. ´0´, ´1´, ´-1´ and ´2´ denote no
                                     output, uppercase, lowercase and
                                     titlecase, respectively.

       heading_flag  <number>        0
                                     Same as lethead_flag. (Note: makeindex
                                     uses a different name, headings_flag.)

       headings_flag  <number>       0
                                     Same as lethead_flag (compatible with
                                     makeindex).

       kana_head  <string>           ""
                                     Heading characters of Kana, specified as
                                     a string. By default, they are controlled
                                     by letter_head and the command line
                                     option -g. (Extended by upmendex)

       hangul_head  <string>         "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
                                     Heading characters of Hangul, specified
                                     as a string. (Extended by upmendex)

       tumunja  <string>             "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
                                     Heading characters of Hangul, specified
                                     as a string. (Deprecated, Extended by
                                     upmendex)

       hanzi_head  <string>          ""
                                     Heading strings of hanzi (Kanji, Hanja),
                                     specified as a single string in which the
                                     items are joined with the separator ´;´.
                                     (Extended by upmendex)

       devanagari_head  <string>     "ऄअआइईउऊऋऌऍऎएऐऑऒओऔकखगघङचछजझञटठडढणतथदधनपफबभमयरलळवशषसह"
                                     Heading characters of Devanagari,
                                     specified as a string. (Experimental,
                                     Extended by upmendex)

       thai_head  <string>           "กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรฤลฦวศษสหฬอฮ"
                                     Heading characters of Thai script,
                                     specified as a string. (Experimental,
                                     Extended by upmendex)

       item_0  <string>              "\n  \\item "
                                     Command sequence inserted between primary
                                     level entries.

       item_1  <string>              "\n     \\subitem "
                                     Command sequence inserted between sub
                                     level entries.

       item_2  <string>              "\n       \\subsubitem "
                                     Command sequence inserted between subsub
                                     level entries.

       item_01  <string>             "\n    \\subitem "
                                     Command sequence inserted between primary
                                     and sub level entries.

       item_x1  <string>             "\n    \\subitem "
                                     Command sequence inserted between primary
                                     and sub level entries when the main entry
                                     does not have a page number.

       item_12  <string>             "\n    \\subsubitem "
                                     Command sequence inserted between sub and
                                     subsub level entries.

       item_x2  <string>             "\n    \\subsubitem "
                                     Command sequence inserted between sub and
                                     subsub level entries when the sub level
                                     entry does not have a page number.

       delim_0  <string>             ", "
                                     Delimiter string between a primary level
                                     entry and its first page number.

       delim_1  <string>             ", "
                                     Delimiter string between a sub level
                                     entry and its first page number.

       delim_2  <string>             ", "
                                     Delimiter string between a subsub level
                                     entry and its first page number.

       delim_n  <string>             ", "
                                     Delimiter string between page numbers,
                                     used for every entry level.

       delim_r  <string>             "--"
                                     Delimiter string between the two pages
                                     that bound a page range.

       delim_t  <string>             ""
                                     Delimiter string output at the end of a
                                     page number list.

       suffix_2p  <string>           ""
                                     String to be inserted in place of delim_n
                                     and the next page number when two pages
                                     are contiguous. It takes effect only when
                                     the parameter is defined.

       suffix_3p  <string>           ""
                                     String to be inserted in place of delim_r
                                     and the third page number when three
                                     pages are contiguous. This parameter
                                     takes precedence over suffix_mp. It takes
                                     effect only when the parameter is
                                     defined.

       suffix_mp  <string>           ""
                                     String to be inserted in place of delim_r
                                     and the last page number when three or
                                     more pages are contiguous. It takes
                                     effect only when the parameter is
                                     defined.

       encap_prefix  <string>        "\\"
                                     Prefix of an encapsulating command when
                                     the encapsulating command is added to the
                                     page number.

       encap_infix  <string>         "{"
                                     String inserted just before the page
                                     number when the encapsulating command is
                                     added to the page number.

       encap_suffix  <string>        "}"
                                     Suffix after the page number when the
                                     encapsulating command is added to the
                                     page number.

       line_max  <number>            72
                                     Maximum length of one line. Lines that
                                     exceed this length are folded.

       indent_space  <string>        ""
                                     Indent string inserted at the beginning
                                     of a folded line.

       indent_length  <number>       16
                                     Length of the indent inserted at the
                                     beginning of a folded line.

       symhead_positive  <string>    "Symbols"
                                     String output as the heading for symbols
                                     when lethead_flag, heading_flag or
                                     headings_flag is a positive number.

       symhead_negative  <string>    "symbols"
                                     String output as the heading for symbols
                                     when lethead_flag, heading_flag or
                                     headings_flag is a negative number.

       symbol  <string>              ""
                                     String output as the heading for symbols
                                     when symbol_flag is nonzero. If
                                     specified, this option takes precedence
                                     over symhead_positive and
                                     symhead_negative. (Extended by
                                     (up)mendex)

       numhead_positive  <string>    "Numbers"
                                     String output as the heading for numbers
                                     when lethead_flag, heading_flag or
                                     headings_flag is a positive number and
                                     symbol_flag is 2.

       numhead_negative  <string>    "numbers"
                                     String output as the heading for numbers
                                     when lethead_flag, heading_flag or
                                     headings_flag is a negative number and
                                     symbol_flag is 2.

       symbol_flag  <number>         1
                                     Flag controlling the output of symbol
                                     headings. If ´0´, do not output headings
                                     for symbols and numbers. If ´1´, output
                                     symbols and numbers as a single group of
                                     symbols. If ´2´, output symbols and
                                     numbers separately. (Extended by
                                     (up)mendex)

       letter_head  <number>         1
                                     Flag selecting the heading letters for
                                     Japanese Kana. ´1´ selects Katakana and
                                     ´2´ selects Hiragana. (Extended by
                                     (up)mendex)

       priority  <number>            0
                                     Flag selecting the sorting method for
                                     index words composed of Japanese and
                                     non-Japanese (e.g. Latin) scripts. If
                                     nonzero, one space (U+0020) is inserted
                                     between each Japanese sequence and
                                     non-Japanese sequence in the sorting
                                     procedure. (Extended by (up)mendex)

       character_order  <string>     "SNLGCJKHDTah"
                                     Order of scripts and symbols. ´S´, ´N´,
                                     ´L´, ´G´, ´C´, ´J´, ´K´, ´H´, ´D´, ´T´,
                                     ´a´ and ´h´ denote symbol, number, Latin,
                                     Greek, Cyrillic, Japanese Kana, Korean
                                     Hangul, Hanja, Devanagari, Thai, Arabic
                                     and Hebrew script, respectively. Make
                                     sure that ´S´ and ´N´ are next to each
                                     other if symbol_flag=1, since numbers are
                                     then classified as symbols. (Extended by
                                     upmendex)

       script_preamble  <string 1>  <string 2>
                                     ""
                                     Preamble of a script block in the output
                                     file, specified by string 2. String 1
                                     must be one of the script names ´latin´,
                                     ´cyrillic´, ´greek´, ´kana´, ´hangul´,
                                     ´hanzi´, ´devanagari´, ´thai´, ´arabic´,
                                     or ´hebrew´. (Extended by upmendex)

       script_postamble  <string 1>  <string 2>
                                     ""
                                     Postamble of a script block in the output
                                     file, specified by string 2. String 1
                                     must be one of the script names ´latin´,
                                     ´cyrillic´, ´greek´, ´kana´, ´hangul´,
                                     ´hanzi´, ´devanagari´, ´thai´, ´arabic´,
                                     or ´hebrew´. (Extended by upmendex)

       icu_locale  <string>          ""
                                     Locale for the ICU collator. By default,
                                     the "root sort order" is used. (Extended
                                     by upmendex)

       icu_rules  <string>           ""
                                     Customized collation rules for the ICU
                                     collator. Unicode characters in UTF-8
                                     encoding and the following escape
                                     sequences are accepted: \Uhhhhhhhh
                                     (8-digit hexadecimal [0-9A-Fa-f]), \uhhhh
                                     (4-digit hexadecimal), \xhh (2-digit
                                     hexadecimal), \x{h...} (1..8-digit
                                     hexadecimal), and \ooo (3-digit octal
                                     [0-7]). If icu_rules and icu_locale are
                                     both specified, the collation rules
                                     specified by icu_rules are added to those
                                     specified by icu_locale. By default, the
                                     locale is used. (Extended by upmendex)
       Ref. <https://unicode-org.github.io/icu/userguide/collation/customization/>,
       <http://www.unicode.org/reports/tr35/tr35-collation.html#Rules>

       icu_attributes  <string>      ""
                                     Attributes for the ICU collator. The
                                     following are available:
                                     "alternate:shifted",
                                     "alternate:non-ignorable",
                                     "strength:primary", "strength:secondary",
                                     "strength:tertiary",
                                     "strength:quaternary",
                                     "strength:identical",
                                     "french-collation:on",
                                     "french-collation:off", "case-first:off",
                                     "case-first:upper-first",
                                     "case-first:lower-first",
                                     "case-level:on", "case-level:off",
                                     "normalization-mode:on",
                                     "normalization-mode:off",
                                     "numeric-ordering:on",
                                     "numeric-ordering:off". (Extended by
                                     upmendex)
       Ref. <https://unicode-org.github.io/icu/userguide/collation/customization/#default-options>,
       <http://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options>
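
       The interaction of delim_n, delim_r and implicit page range formation
       (three or more successive pages, disabled by -r) can be sketched in
       Python as follows. This is an illustration under simplifying
       assumptions, not the upmendex implementation: pages are plain
       integers, and suffix_2p, suffix_3p, suffix_mp and encapsulating
       commands are ignored. The name format_pages is hypothetical.

```python
def format_pages(pages, delim_n=", ", delim_r="--", implicit_ranges=True):
    """Join page numbers, abbreviating runs of three or more pages."""
    pages = sorted(set(pages))
    parts = []
    i = 0
    while i < len(pages):
        j = i
        # Extend j to the end of the run of successive pages starting at i.
        while j + 1 < len(pages) and pages[j + 1] == pages[j] + 1:
            j += 1
        if implicit_ranges and j - i >= 2:
            # Three or more successive pages: first page, delim_r, last page.
            parts.append(f"{pages[i]}{delim_r}{pages[j]}")
        else:
            parts.extend(str(p) for p in pages[i:j + 1])
        i = j + 1
    return delim_n.join(parts)


with_ranges = format_pages([1, 2, 3, 4, 5, 8, 9, 11])
without_ranges = format_pages([1, 2, 3, 4, 5, 8, 9, 11], implicit_ranges=False)
```

       Here with_ranges is "1--5, 8, 9, 11": the two contiguous pages 8 and
       9 stay separate because a range needs at least three pages.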

ABOUT JAPANESE PROCESSING

       Compared to makeindex, upmendex has an additional feature that
       simplifies the handling of Japanese indexes: users are spared the
       effort of manually specifying a reading for every kanji word.
       Japanese kanji words are usually sorted by the syllables of their
       readings (´Yomi´), which can be written in kana (Hiragana or
       Katakana). upmendex accepts index words given directly in kana in an
       input file, and can also convert index words written in kanji or
       symbols to phonetic scripts by referring to Japanese dictionaries.

       Examples of the internal simplification of syllables are shown below.

              かぶしきがいしゃ         かふしきかいしや
              マッキントッシュ         まつきんとつしゆ
              ワープロ                 わあふろ

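       The three examples above can be reproduced with the following Python
       sketch. It is inferred from those examples only and is not the code
       used by upmendex: it drops voicing marks, converts Katakana to
       Hiragana, enlarges small kana, and replaces the long vowel mark with
       the vowel of the preceding kana. The name simplify_kana and the
       tables are hypothetical.

```python
import unicodedata

SMALL_TO_LARGE = str.maketrans("ぁぃぅぇぉっゃゅょゎ", "あいうえおつやゆよわ")
VOWEL_ROWS = {
    "あ": "あかさたなはまやらわ",
    "い": "いきしちにひみり",
    "う": "うくすつぬふむゆる",
    "え": "えけせてねへめれ",
    "お": "おこそとのほもよろを",
}
# Map each unvoiced, full-size Hiragana to its vowel.
VOWEL_OF = {kana: vowel for vowel, row in VOWEL_ROWS.items() for kana in row}


def simplify_kana(text):
    """Reduce kana to a plain Hiragana form, as in the examples above."""
    out = []
    # NFD splits voiced kana into a base kana plus a combining mark.
    for ch in unicodedata.normalize("NFD", text):
        if ch in "\u3099\u309a":
            continue  # drop combining dakuten and handakuten
        if "\u30a1" <= ch <= "\u30f6":
            ch = chr(ord(ch) - 0x60)  # Katakana to Hiragana
        ch = ch.translate(SMALL_TO_LARGE)
        if ch == "ー" and out:
            ch = VOWEL_OF.get(out[-1], ch)  # long mark repeats the vowel
        out.append(ch)
    return "".join(out)
```
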
       The dictionary file consists of a list of <´index_word´ ´reading´>
       pairs. The index word can be written in any script (kanji, kana,
       etc.), and the reading can be in any phonetic script such as Hiragana
       or Katakana. The delimiter between the index word and its reading is
       one or more tabs or spaces.
       An example of a Japanese dictionary is shown below.

              漢字      かんじ
              読み      よみ
              環境      かんきょう
              $        ドル

       Each index word is allowed to have only one Yomi. Although some kanji
       words (e.g. 「表」) have more than one Yomi (e.g. 「ひょう」 and
       「おもて」), only one of them can be registered in the dictionary.
       When different Yomi are needed, they should be specified explicitly
       in kana (e.g. \index{ひょう@表} or \index{おもて@表}) in the input
       file.
       Moreover, a dictionary file is read automatically when its file name
       is set in the environment variable INDEXDEFAULTDICTIONARY. The
       dictionary set by the environment variable can be used together with
       files specified by the -d option.
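
       The dictionary format described in this section (an index word, one
       or more tabs or spaces, then a reading) can be read with a few lines
       of Python. This is a sketch for illustration, not the upmendex
       reader; in particular, keeping the first reading when a word appears
       twice is an assumption made here, and the name parse_dictionary is
       hypothetical.

```python
def parse_dictionary(text):
    """Map each index word to its single reading."""
    readings = {}
    for line in text.splitlines():
        fields = line.split(None, 1)  # split at the first run of blanks
        if len(fields) != 2:
            continue  # skip blank or incomplete lines
        word, reading = fields[0], fields[1].strip()
        # Assumption: if a word is listed twice, the first reading is kept.
        readings.setdefault(word, reading)
    return readings


SAMPLE = "漢字\tかんじ\n読み      よみ\n環境  \t かんきょう\n$        ドル\n"
readings = parse_dictionary(SAMPLE)
```
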

ABOUT SORTING PROCEDURE

       By default, upmendex sorts index entries as they are (´sort by word
       order´). With the -l option, spaces between words in an index entry
       are removed before sorting (´sort by character order´).
       Even with sort by character order, the index entry in the output
       keeps its original form, including the spaces.
       An example follows.

              sort by word order       sort by character order
              X Window                 Xlib
              Xlib                     XView
              XView                    X Window

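       The difference between the two orders can be sketched in Python with
       two sort keys. The sketch uses str.casefold and plain code point
       comparison as a stand-in for the ICU collation that upmendex relies
       on, so it only illustrates the effect of removing spaces; the
       function names are hypothetical.

```python
def word_order_key(entry):
    """Sort key that keeps the spaces between words."""
    return entry.casefold()


def character_order_key(entry):
    """Sort key with spaces removed, as with the -l option."""
    return entry.replace(" ", "").casefold()


entries = ["XView", "Xlib", "X Window"]
by_word = sorted(entries, key=word_order_key)
by_character = sorted(entries, key=character_order_key)
```

       The sorted lists still contain the original strings, which mirrors
       the statement that the output keeps the spaces.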
       In addition, two sorting methods can be applied to index entries that
       contain both Japanese kana and other scripts (e.g. Latin script).
       With priority set to 0 (default) in the style file, no space is
       inserted between Japanese kana and other scripts before sorting; with
       priority set to 1, a space is inserted.
       An example follows.

              priority=0               priority=1
              index sort               indファイル
              indファイル              index sort
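
       The effect of priority can be sketched in Python, following the
       description of the priority parameter in STYLE FILE (a nonzero value
       inserts one space between Japanese and non-Japanese sequences for
       sorting). The sketch is an illustration only: it treats Hiragana,
       Katakana and common kanji as Japanese, and it compares code points
       instead of using ICU collation. The names JP, BOUNDARY and sort_key
       are hypothetical.

```python
import re

JP = "\u3040-\u30ff\u4e00-\u9fff"  # Hiragana, Katakana and common kanji
# Zero-width positions where a Japanese and a non-Japanese sequence meet.
BOUNDARY = re.compile(
    f"(?<=[^{JP}\\s])(?=[{JP}])|(?<=[{JP}])(?=[^{JP}\\s])"
)


def sort_key(entry, priority=0):
    """Sort key; a nonzero priority inserts a space at script boundaries."""
    if priority != 0:
        entry = BOUNDARY.sub(" ", entry)
    return entry.casefold()


entries = ["indファイル", "index sort"]
priority_0 = sorted(entries, key=lambda e: sort_key(e, priority=0))
priority_1 = sorted(entries, key=lambda e: sort_key(e, priority=1))
```
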

ENVIRONMENT VARIABLES

       upmendex refers to the following environment variables.

       INDEXSTYLE
                 Directory where index style files are located.

       INDEXDEFAULTSTYLE
                 Index style file to be used by default.

       INDEXDICTIONARY
                 Directory where dictionary files are located.

       INDEXDEFAULTDICTIONARY
                 Dictionary file that is read automatically.

DETAIL

       The detailed specification is compatible with makeindex.

KNOWN ISSUES

       When more than one page number format is used, the .idx files should
       be specified in the order of their page numbers. Otherwise, wrong
       page numbers might be output.

SEE ALSO

       tex(1), latex(1), makeindex(1), mendex(1).
       International Components for Unicode (ICU): <http://icu.unicode.org/>,
       <https://unicode-org.github.io/icu/>

AUTHOR

       This manual page was written by Takuji Tanaka, based on the mendex
       manual page written by the Japanese TeX Development Community.

                                                                   UPMENDEX(1)