NGRAMSHRINK(1)                   User Commands                  NGRAMSHRINK(1)

NAME
       ngramshrink - manual page for ngramshrink 1.3.14

DESCRIPTION
       Shrink n-gram model from input model file.

       Usage: ngramshrink [--options] [in.fst [out.fst]]

       PROGRAM FLAGS:

       --backoff_label: type = int64_t, default = 0
              Backoff label

       --check_consistency: type = bool, default = false
              Check model consistency

       --context_pattern: type = std::string, default = ""
              Pattern of contexts to prune

       --count_pattern: type = std::string, default = ""
              Pattern of counts to prune

       --list_file: type = std::string, default = ""
              File with list of n-grams to prune

       --method: type = std::string, default = "seymore"
              One of: "context_prune", "count_prune", "relative_entropy",
              "seymore", "list_prune"

       --min_order_to_prune: type = int32_t, default = 2
              Minimum n-gram order to prune

       --norm_eps: type = double, default = 0.001
              Normalization check epsilon

       --retry_downcase: type = bool, default = false
              If a pruned symbol is not found in the FST, automatically
              tries the lower-cased variant of this symbol. Only useful in
              list_prune mode.

       --shrink_opt: type = int32_t, default = 0
              Optimization level: Range 0 (fastest) to 2 (most accurate)

       --target_number_of_ngrams: type = int64_t, default = -1
              Maximum number of ngrams to leave in model after pruning.
              Value less than zero means no target number, just use theta.

       --theta: type = double, default = 0
              Pruning threshold theta

       --total_unigram_count: type = double, default = -1
              Total unigram count
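
       The following is a minimal sketch of how the flags above might be
       combined, not part of the tool's own help text. The file names
       in.fst and out.fst, the helper build_shrink_cmd, and the theta value
       1.0e-7 are illustrative assumptions; only the flag names and the
       method name come from the list above. The script prints the command
       it would run and executes it only if ngramshrink and the input model
       are actually present.

```shell
#!/bin/sh
# Hypothetical file names; substitute your own model paths.
IN_MODEL="in.fst"
OUT_MODEL="out.fst"

# Hypothetical helper: assemble an ngramshrink command line.
#   $1 = pruning method (one of the values listed for --method)
#   $2 = pruning threshold passed to --theta
build_shrink_cmd() {
  printf 'ngramshrink --method=%s --theta=%s %s %s' \
    "$1" "$2" "$IN_MODEL" "$OUT_MODEL"
}

# The theta value here is an assumed example, not a recommended setting.
CMD=$(build_shrink_cmd relative_entropy 1.0e-7)
echo "$CMD"

# Run the command only when the tool and the input model both exist.
if command -v ngramshrink >/dev/null 2>&1 && [ -f "$IN_MODEL" ]; then
  $CMD
fi
```

       To cap the model size instead of relying on theta alone, a
       non-negative --target_number_of_ngrams value could be appended in
       the same way; as stated above, a value below zero means no target
       is applied.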

       LIBRARY FLAGS:

       Flags from: flags.cc

       --help: type = bool, default = false
              show usage information

       --helpshort: type = bool, default = false
              show brief usage information

       --tmpdir: type = std::string, default = "/tmp"
              temporary directory

       --v: type = int32_t, default = 0
              verbosity level

       Flags from: fst.cc

       --fst_align: type = bool, default = false
              Write FST data aligned where appropriate

       --fst_default_cache_gc: type = bool, default = true
              Enable garbage collection of cache

       --fst_default_cache_gc_limit: type = int64_t, default = 1048576
              Cache byte size that triggers garbage collection

       --fst_read_mode: type = std::string, default = "read"
              Default file reading mode for mappable files

       --fst_verify_properties: type = bool, default = false
              Verify FST properties queried by TestProperties

       --save_relabel_ipairs: type = std::string, default = ""
              Save input relabel pairs to file

       --save_relabel_opairs: type = std::string, default = ""
              Save output relabel pairs to file

       Flags from: ngram-output.cc

       --end_symbol: type = std::string, default = "</s>"
              Class label for sentence end

       --start_symbol: type = std::string, default = "<s>"
              Class label for sentence start

       Flags from: symbol-table.cc

       --fst_compat_symbols: type = bool, default = true
              Require symbol tables to match when appropriate

       --fst_field_separator: type = std::string, default = " "
              Set of characters used as a separator between printed fields

       Flags from: util.cc

       --fst_error_fatal: type = bool, default = true
              FST errors are fatal; otherwise return objects flagged as
              bad: e.g., FSTs: kError property set, FST weights: not a
              Member()

       --ngram_error_fatal: type = bool, default = true
              NGram errors are fatal if true; otherwise returns objects
              flagged as bad: e.g., NGramModel::Error() is true

       Flags from: weight.cc

       --fst_weight_parentheses: type = std::string, default = ""
              Characters enclosing the first weight of a printed composite
              weight (e.g., pair weight, tuple weight and derived classes)
              to ensure proper I/O of nested composite weights; must have
              size 0 (none) or 2 (open and close parenthesis)

       --fst_weight_separator: type = std::string, default = ","
              Character separator between printed composite weights; must
              be a single character

ngramshrink 1.3.14                February 2022                 NGRAMSHRINK(1)