TRIETOOL(1)                 General Commands Manual                TRIETOOL(1)

NAME

       trietool - trie manipulation tool

SYNOPSIS

       trietool [ options ] trie command arg ...

DESCRIPTION

       trietool is the command-line tool for manipulating double-array trie
       data.  It can be used to query, add and remove words in a trie.

   The Trie
       The trie argument specifies the name of the trie to manipulate.  A
       trie is stored in a file with the `.tri' extension.  However, to
       create a new trie, one needs to prepare a file with the `.abm'
       extension, describing the Unicode ranges of the alphabet set of the
       trie.  The ABM defines a set of vectors that map Unicode characters
       into a continuous range of integers.  The mapped integers are used as
       the internal alphabet of the trie.  Such a mapping can improve space
       allocation within the trie data, regardless of gaps in the character
       set being used, as the mapped range is always continuous.

       The ABM file is a plain text file, with each line listing a range of
       32-bit Unicode code points to be added to the alphabet set, in the
       format:

              [0xSSSS,0xTTTT]

       where `0xSSSS' and `0xTTTT' are the hexadecimal values of the starting
       and ending character codes of the range, respectively.

       For example, for a dictionary that contains only English words without
       any punctuation, one may prepare `trie.abm' as:

              [0x0041,0x005a]
              [0x0061,0x007a]

       The first line lists the ASCII codes for A-Z, and the second for a-z.

       No more than 255 characters are allowed in the alphabet set of a trie.

       The created `.tri' file incorporates the ABM data.  The `.abm' file is
       therefore not required after the trie is first created, and is ignored
       from then on.
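
       The mapping described in this section can be sketched in Python.  This
       is only an illustration, not the libdatrie implementation: the names
       parse_abm and build_mapping are hypothetical, the parser accepts only
       the exact line format documented here, and numbering the mapped
       integers from 0 is an assumption.

```python
import re

# One ABM line in the documented form [0xSSSS,0xTTTT].
# Assumption: the real parser may be more lenient about whitespace or case.
RANGE_RE = re.compile(r"^\[0x([0-9A-Fa-f]+),0x([0-9A-Fa-f]+)\]$")

MAX_ALPHABET = 255  # limit stated in this manual page


def parse_abm(text):
    """Return a list of (start, end) code point ranges from ABM text."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are skipped
        match = RANGE_RE.match(line)
        if match is None:
            raise ValueError(f"not an ABM range line: {line!r}")
        start, end = int(match.group(1), 16), int(match.group(2), 16)
        if start > end:
            raise ValueError(f"range start exceeds end: {line!r}")
        ranges.append((start, end))
    return ranges


def build_mapping(ranges):
    """Map every character in the ranges to consecutive integers from 0."""
    mapping = {}
    for start, end in sorted(ranges):
        for code_point in range(start, end + 1):
            # setdefault keeps the first index if ranges overlap.
            mapping.setdefault(chr(code_point), len(mapping))
    if len(mapping) > MAX_ALPHABET:
        raise ValueError("alphabet set exceeds 255 characters")
    return mapping


# The `trie.abm' example from this page: A-Z followed by a-z.
abm_text = "[0x0041,0x005a]\n[0x0061,0x007a]\n"
mapping = build_mapping(parse_abm(abm_text))
print(len(mapping), mapping["A"], mapping["Z"], mapping["a"], mapping["z"])
# prints: 52 0 25 26 51
```

       Although the code points of `Z' (0x5a) and `a' (0x61) are not
       adjacent, their mapped values are, which is the gap-free property the
       text refers to.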

COMMANDS

       Available commands are:

       add word data ...
              Add word to trie, associated with integer data.  An arbitrary
              number of word-data pairs can be given.  Arguments are read two
              at a time: the first is treated as the word, and the second as
              its data.

       add-list [ options ] list-file
              Add words with associated data listed in list-file to trie.  The
              list-file must be a text file listing one word per line.  The
              associated data can be put after the word on the same line,
              separated by a tab (`\t') character.  If the data field is
              omitted, a default value (-1) is used instead.

              Options are available for this command:

              -e, --encoding enc
                     Specify the character encoding of the list-file contents,
                     such as `UTF-8'.  If omitted, the current locale codeset
                     is assumed.

       delete word ...
              Delete word from trie.  An arbitrary number of words to delete
              can be given.

       delete-list [ options ] list-file
              Delete words listed in list-file from trie.  The list-file must
              be a text file listing one word per line.

              Options are available for this command:

              -e, --encoding enc
                     Specify the character encoding of the list-file contents,
                     such as `UTF-8'.  If omitted, the current locale codeset
                     is assumed.

       query word
              Search for word in trie.  If word exists, its associated data is
              printed to standard output.  Otherwise, an error message is
              printed to standard error, and nothing is printed to standard
              output.

       list   List all words in trie to standard output.  The output lists one
              word-data pair per line, separated by a tab (`\t') character,
              a format suitable for use as the list-file of the add-list
              command.
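
       The list-file format read by add-list and written by list can be
       sketched in Python.  The helper names parse_list_file and
       format_list_file are hypothetical, and the sketch does not invoke
       trietool itself; skipping empty lines and treating an empty data field
       as omitted are assumptions.

```python
DEFAULT_DATA = -1  # default stated in this manual page for an omitted data field


def parse_list_file(text):
    """Return (word, data) pairs from list-file text, one word per line."""
    pairs = []
    for line in text.splitlines():
        if not line:
            continue  # assumption: empty lines carry no word
        # The word and its optional data are separated by a single tab.
        word, _sep, data = line.partition("\t")
        value = int(data) if data.strip() else DEFAULT_DATA
        pairs.append((word, value))
    return pairs


def format_list_file(pairs):
    """Render pairs as word<TAB>data lines, as described for `list' output."""
    return "".join(f"{word}\t{data}\n" for word, data in pairs)


sample = "apple\t1\nbanana\ncherry\t42\n"
pairs = parse_list_file(sample)
print(pairs)
print(format_list_file(pairs), end="")
```

       Parsing the formatted output again yields the same pairs, which
       mirrors the statement that list output is suitable as input for
       add-list.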

OPTIONS

       This program follows the usual GNU command line syntax, with long
       options starting with two dashes (`--').  A summary of options is
       included below.

       -p, --path dir
              Set trie directory to dir [default=`.']

       -h, --help
              Show summary of options.

       -V, --version
              Show version of program.

AUTHOR

       libdatrie was written by Theppitak Karoonboonyanan.

       This manual page was written by Theppitak Karoonboonyanan
       <theppitak@gmail.com>.

                                 DECEMBER 2008                     TRIETOOL(1)